The aim of the CXXR project is gradually to refactor (reengineer) the interpreter of the R language, currently written for the most part in C, into C++, whilst as far as possible retaining full functionality. CXXR is being carried out independently of the main R development and maintenance effort.
Note: the CXXR documentation often uses the acronym CR to refer to the standard R interpreter, in contradistinction to CXXR.
It is hoped that by reorganising the code along object-oriented lines, by deploying the tighter code encapsulation that is possible in C++, and by improving the internal documentation, the project will make it easier for researchers to develop experimental versions of the R interpreter. An important subsidiary objective is to create a variant of R with built-in facilities for provenance tracking, so that for any R data object it will be possible to determine exactly which original data files it was derived from, and exactly which sequence of operations was used to produce it: if you remember the old S AUDIT facility, you will probably know how useful this can be.
C++, though perhaps somewhat unfashionable, is a strongly-typed language with a powerful range of facilities for object-oriented programming. In its design, constant attention has been paid to providing a smooth conversion pathway from C. Compilers, including free compilers, are readily available, and the language is well standardised. The current standard is ISO/IEC 14882:2003, but the objective in CXXR is to require only that the compiler be able to cope with code conforming to the earlier standard, ISO/IEC 14882:1998. And last but not least, it is a language that I have had years of experience with (though I am always learning more!).
Maybe you're right: if you have the time and the expertise, go right ahead!
CellPool, MemoryBank and Allocator look after memory allocation; GCManager, GCNode, GCRoot and WeakRef look after garbage collection. (All CXXR classes are within the namespace CXXR.) Garbage collection is now based primarily on reference counting, with a (non-generational) mark-sweep algorithm as a backstop.
The SEXPREC union of CR has been converted into an extensible hierarchy of classes rooted at a class RObject (which inherits from GCNode). The functionality of duplicate1() (in CR's file duplicate.c) has been reimplemented using class copy constructors and a virtual function RObject::clone(). Code associated with a particular R data type is progressively being shifted into the relevant class, and C++'s public/protected/private access controls used to defend class invariants.
[...] RObject hierarchy can apply its own checks on how attributes are set, and override the default way in which attribute values are stored internally.
[...] Frame as the fundamental building block. Facilities such as those provided by the package RObjectTables can now be implemented more simply by inheriting from Frame. Hooks have been provided for monitoring the reading or writing of symbol bindings within environments.
[...] RCNTXT) have been separated and refactored using a variety of mechanisms. In particular, indirect flows of control are now much more in line with C++ idioms, notably in relying on object destructors to restore necessary state as the stack is unwound.
[...] $(R_HOME)/include/CXXR API. For example, R's subscripting operations (subsetting and subassignment) are now carried out by algorithms implemented as C++ templates, so that they are applicable to generalised vectors of arbitrary element types, not just the R built-in vector types.
See the refactoring history for more information.
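To illustrate how duplication can be expressed with copy constructors and a virtual clone() function, here is a minimal sketch. It is not CXXR's actual code: the namespace sketch and the classes Node, IntVec and List are hypothetical stand-ins invented for this example, and memory is managed with plain new/delete rather than CXXR's garbage collector. It is written in C++98 style, in keeping with the compiler requirement mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not CXXR's own

// Hypothetical root class, standing in for something like RObject.
class Node {
public:
    virtual ~Node() {}
    // Returns a heap-allocated copy of the object's dynamic type.
    // The caller owns the result.
    virtual Node* clone() const = 0;
    virtual std::size_t size() const = 0;
protected:
    Node() {}
    Node(const Node&) {}           // copying is reachable only via clone()
private:
    Node& operator=(const Node&);  // assignment disabled (C++98 idiom)
};

// Hypothetical leaf class holding a vector of ints.
class IntVec : public Node {
public:
    explicit IntVec(std::size_t n, int fill = 0) : m_data(n, fill) {}
    IntVec(const IntVec& other) : Node(other), m_data(other.m_data) {}
    IntVec* clone() const { return new IntVec(*this); }  // covariant return
    std::size_t size() const { return m_data.size(); }
    int& operator[](std::size_t i) { return m_data[i]; }
    int operator[](std::size_t i) const { return m_data[i]; }
private:
    std::vector<int> m_data;
};

// Hypothetical container class: its copy constructor deep-copies
// each child by calling the child's own clone().
class List : public Node {
public:
    List() {}
    List(const List& other) : Node(other) {
        for (std::size_t i = 0; i < other.m_items.size(); ++i)
            m_items.push_back(other.m_items[i]->clone());
    }
    ~List() {
        for (std::size_t i = 0; i < m_items.size(); ++i)
            delete m_items[i];
    }
    List* clone() const { return new List(*this); }
    std::size_t size() const { return m_items.size(); }
    void append(Node* item) { m_items.push_back(item); }  // takes ownership
    Node* at(std::size_t i) const { return m_items[i]; }
private:
    std::vector<Node*> m_items;
};

}  // namespace sketch
```

The point of the design is that each class knows how to copy itself, so a caller holding only a base-class pointer can duplicate an arbitrarily nested object with a single call to clone(), without a central function that switches on the object's type.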
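The idea of relying on object destructors to restore state as the stack is unwound can be sketched as follows. This is an illustration of the general C++ idiom only, under stated assumptions: the names ScopedRestore, eval_depth, evaluate and try_evaluate are hypothetical and do not come from CXXR.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {  // hypothetical namespace, not CXXR's own

// Hypothetical piece of interpreter state, e.g. an evaluation depth.
int eval_depth = 0;

// Saves the value of a variable on construction and puts it back on
// destruction, whether the scope is left normally or by an exception.
template <typename T>
class ScopedRestore {
public:
    explicit ScopedRestore(T& target) : m_target(target), m_saved(target) {}
    ~ScopedRestore() { m_target = m_saved; }
private:
    T& m_target;
    T m_saved;
    ScopedRestore(const ScopedRestore&);             // not copyable
    ScopedRestore& operator=(const ScopedRestore&);  // not assignable
};

// Hypothetical evaluation step that may fail part-way through.
void evaluate(bool fail) {
    ScopedRestore<int> guard(eval_depth);
    ++eval_depth;
    if (fail)
        throw std::runtime_error("evaluation error");
    // On normal return the guard's destructor also restores eval_depth.
}

// Runs evaluate() and reports whether it completed without throwing.
bool try_evaluate(bool fail) {
    try {
        evaluate(fail);
    } catch (const std::runtime_error&) {
        return false;
    }
    return true;
}

}  // namespace sketch
```

Because the restoration lives in a destructor, no explicit cleanup code is needed at each exit point: after try_evaluate() returns, eval_depth has its original value whether or not evaluate() threw.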
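As a rough illustration of why implementing subscripting as a template makes it applicable to arbitrary element types, consider the sketch below. It is not CXXR's algorithm: the function subset is hypothetical, it handles only positive 1-based indices, and it throws on an out-of-range index, which is a simplification chosen for this example and does not reproduce R's own subsetting rules.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not CXXR's own

// Selects elements of v by 1-based index. V may be any container type
// offering size(), operator[] and push_back(), so the same code serves
// vectors of any element type.
// Simplification: index 0 and out-of-range indices throw.
template <class V>
V subset(const V& v, const std::vector<std::size_t>& indices) {
    V result;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        std::size_t idx = indices[i];
        if (idx == 0 || idx > v.size())
            throw std::out_of_range("subscript out of bounds");
        result.push_back(v[idx - 1]);
    }
    return result;
}

}  // namespace sketch
```

The same function can be applied to a std::vector<double>, a std::string, or any user-defined vector-like class with the three operations named in the comment, which is the sense in which a templated algorithm extends beyond a fixed set of built-in vector types.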
Certainly, most readily by trying out CXXR and reporting any bugs you find. Beware, however, that if you come across program faults, CXXR is likely to abort gracelessly without saving your work! (Control-C will also abort the interpreter at present.) Testing in a non-English locale would be particularly welcome.
If you want to contribute to coding, experience specifically of C++ would be a definite advantage: unfortunately, good C programmers tend to make bad C++ programmers (and vice versa), and the same goes for Java. I would particularly welcome help in porting CXXR to platforms other than Linux, especially Microsoft Windows (using MinGW etc.).
My contact email is at the foot of this page.
CXXR would obviously not have been feasible without the work of the R core team in developing and maintaining R itself. The overwhelming majority of the code in CXXR is lifted directly from R (under the terms of the GNU General Public Licence). But equally important is the excellent test suite that the R team has developed, and to which I hope CXXR will in due course be able to contribute.
Particular thanks are owed to the following (in alphabetical order):