Ideas for Refactorings, August 2003
Simon Thompson
Structural
-
Swap Arguments
-
Swap the position of two arguments of a function. Requires all
argument lists to be swapped too. Can cause problems with partial
applications.
It could be a special case of a Permute Arguments
refactoring; on the other hand, all permutations can be given
by a sequence of 2-cycles.
-
Permute Arguments
-
Permute the arguments of a function. Requires all
argument lists to be permuted too. Can cause problems with partial
applications.
-
Curry/uncurry Arguments
-
A subsequence of n arguments of a function is selected for
conversion to an n-tuple. Will require the conversion of all
argument lists. Uncurrying can cause problems with partial
applications.
-
Constant becomes an argument
-
Transform the use of a constant within a definition into an argument,
with the constant supplied as the value of the argument at all call
sites (outside the definition itself).
This is a variant of add argument.
-
Unfolding (beware!)
-
It's not always possible to unfold in Haskell: concern about
overloaded
identifiers.
-
Convert between
let
and where
-
Right to left is easier than left to right.
-
Introduce pattern match over an argument position
-
For a function with a variable in a particular argument position in all
its defining equations, replace the variable by an exhaustive set of
patterns over the type of the variable.
-
Replace pattern matching equations by a single equation containing a
case
statement
-
To replace a multi-argument pattern match by a single
case
expression requires that the function is first
uncurried; alternatively, a series of nested case
statements, one per argument, will be required.
The reverse is possible only if the case
switch is over
a variable (or a collection of nested cases
of the right
form).
-
Replace guards with
if ... then ... else ...
-
Requires there to be an
otherwise
case in the guard. No
restriction in the reverse direction.
-
Coalesce equations
-
This can follow the introduction of field names for an algebraic type;
common field names for fields in different summands allow similar
behaviour in the different summands to be expressed in terms of the
field names. (Example: all the binary constructors of a type of
propositions have a left and right-hand argument; in calculating the
size of the expression that's all we need to express the calculation.)
Note: related to simple/layered data types.
-
Introduce
class
and instance
-
This can be done by identifying a type and a collection of functions
over that type.
-
Introduce overloading
-
Identify a type, a
class
definition and a collection of
functions over that type which are to become the body of the
instance
declaration. See also Rename functions as a
class instance.
-
Rename functions as a class instance
-
Existing functions are identified as providing the instance of a class
over a particular type. All their uses have to be modified to use the
renamed functions, and the functions that use them will (potentially)
have their types generalised, to work over the class whose instance has
just been introduced. See also Introduce overloading.
-
Replace explicit recursion by application of a recursion combinator
-
This is an example of folding (in the Burstall/Darlington
sense) against the application of a higher-order function.
-
Introduce composed function
-
The scenario is one in which
f
and g
are
defined separately, yet everywhere used in the combination f.g
.
A new function h = f.g
is introduced; further
transformations (see below) allow the definition of h
directly, without using the definitions of f
and
g
.
The inverse is split definition below.
-
Function/Value level definition
-
Transform a function-level
definition like
h = f.g
into one with
paramter(s): h x = (f.g) x
, or the reverse.
-
Unfold according to a definition
-
Given the scenario
h x = ... f x ...
it is possible to unfold the definition of h
according to
the definition of f
, which might perform the pattern
matches []
and (x:xs)
on x
giving
the two equations
h [] = ...
and
h (y:ys)@x = ...
Could this be achieved by transforming f
's definition into
one using case
before unfolding the function application
f x
, and subsequently turning the case
split
back into a set of pattern matching equations? Answer: probably yes.
-
Name duplicated sub-expressions
-
(Composite refactoring) Give a name for the sub expression and identify
each of its uses (verifying that they are all identical). This can be
given by introduce definition followed by a sequence of
foldings.
-
Introduce HOF
-
It's common to write a first-order definition and then to make it
higher-order by lifting out a particular piece of functionality into a
functional argument. An example is given by
trans (x:xs) = x+2 : trans
xs
.
Selecting the sub expression x+2
, which has a single free
variable, will cause the functional parameter to be unary, and to be
instantiated to (+2)
at all calls of the (original)
trans
.
If the chosen sub-expression has n free variables, then the functional
parameter is n-ary. See also Introduce a function with
overloading.
-
Introduce a function with overloading
-
The common functionality in two definitions
fi = ... gi ...
is rendered in a single function f = ... g ...
and two
instance declarations g = gi
for a class with
g
in its interface.
Note that this is an alternative to Introduce HOF where the
g
becomes a parameter.
-
Eliminate duplicate function definitions
-
Two or more functions are identical. Eliminate all but one, renaming all
uses of the others to the function that remains.
-
-
Modules
-
Split
module
-
A single module is split into two. Potentially control the
interfaces of the two new modules (rather than adopting the default
export and import options).
-
Move a declaration between
modules
-
Will need to amend
import
and export
statements.
Types
-
Name a type using
type
-
Identify uses of a type in a particular way by making them instances
of a type synonym. This has no semantic effect on the program.
Note that type synonyms cannot be the subject of
instance
declarations.
-
Name a type using
data
or newtype
-
Names introduced in this way change the meaning of the program,
since the old and new types are different. The
constructor of the type has to be introduced or removed to convert
between the original and the new type.
-
Add/remove field names in a
data
type
-
Field names give names to selector functions. If field
names are removed, then the corresponding selector functions have to
be defined explicitly.
See also Introduce Abstract Data Type.
-
Add/remove discriminator functions for a
data
type.
-
Discriminator functions decide which part of a sum a value belongs
to. A canonical naming scheme calls the discriminators
isBlah
where Blah
is the corresponding
constructor.
See also Introduce Abstract Data Type.
-
Add/remove explicit constructor functions for a
data
type.
-
That is, add a function
mkBlah
with the same signature
as the constructor Blah
.
See also Introduce Abstract Data Type.
-
Introduce/eliminate Pattern Matching.
-
Given a concrete
data
type, introduce
field names and discriminator functions, as above.
It is then possible to eliminate pattern matching in favour of
selectors and discriminators.
-
Introduce/eliminate an abstract data type (ADT).
-
First Eliminate Pattern Matching as above. Then introduce
explicit constructor functions to replace constructors.
The interface of the ADT inlcudes selectors, discriminators and
constructor functions. Other functions selected by the user; see
Migrate Functionality below.
-
Migrate Functionality through Interface
-
A function is moved across an ADT interface. A function defined
'outside' the implementation capsule can be moved inside (and then
reimplemented in a more efficient way); a function from inside the
capsule can be moved outside, if its definition only uses functions
in the ADT interface.
-
Replace function by constructor
-
A constructor (with a small 'c') function over an algebraic type is
replaced by a Constructor. This allows pattern-matching definitions
to take account of this particular case, but on the other hand
forces all definitions to have this extra case.
-
Convert between algebraic and existential types.
-
This trades off pattern matching (on the elgebraic type) against
extensibility, à la OOP. Will not work well with
binary methods.
This can be seen as a sequence
of simpler refactorings, including splitting the sum into its
constituent parts, introducing an interface (class
) for the
functions over the types and so forth.
-
Simple vs layered
data
types
-
Best seen in the context of an example. In representing a type of
arithmetic expressions, can either have one expression constructor
per operator (simple), or an expression constructor which has an
extra field from an enumerated type which represents operators.
-
Change implementation of an ADT
-
In the most general sense, this will require that a semantically
equivalent implementation is produced. One might implement sets by
lists, or other means.
A specific example comes from the memoisation of a value within
the implementation of a data type: e.g. an extra field in a tree which
records the depth of a tree. This sort of memoisation can be introduced
automatically.
-
Replace a finite set of constants with enumerated type
-
Using a fixed set of constant values to represent a finite set of cases
can be replaced with values from an enumerated type.
-
Change in type annotation
-
It is possible to give more specific or more general type annotations
for existing definitions. Constraints will be placed by the context and
use of the definition in question.
There's also the issue of moving between Int
and
Integer
.
-
Type change: convert
Maybe
to List
or
to Either
, or Bool
to
Maybe
.
-
This conversion forces all definitions over the type in question to be
modified.
-
Type change: convert between tuples and (one constructor)
algebraic types; between tuples and (homogeneous) lists.
-
This conversion forces all definitions over the type in question to be
modified.
More complex refactorings ... challenges?
-
Memoise
-
Memoise the values of a functions. This would be very neat.
equivalent implementation is produced. One might implement sets by
lists, or other means.
A specific example comes from the memoisation of a value within
the implementation of a data type: e.g. an extra field in a tree which
records the depth of a tree.
-
Deforestation
-
Remove intermediate lists from lazy functional programs.
Note: Olaf Chitil has worked in this area.
-
Split definition
-
A function does two things: for instance, it calculates a value and
formats it in some way. This refactoring should turn the original
function into the composition of the two distinct operations.
Could the place to split be indicated by its type? Could
this in some sense be seen as a reforestation transformation?
-
Introduce
Monad
-
From an instance of computation within a particular monad, introduce an
explicitly monadic computation.
Alternatively, it would be possible to introduce a sequentialised
(monadic)
computation of any particular expression.
-
Introduce exception handling
-
This can be introduced instead of using the
error
function,
or it can be introduced ab initio.
Not quite refactorings
-
Function template
-
From a given definition of a function, extract a template for defining
a new function.
Similarly, could build a template from the form of a data
type or types.
-
Check invariants and pre-/post-conditions
-
Is this strictly a refactoring? Automatically add checks for certain
conditions to every function call, or data invariance checks for an ADT.
-
Modify
data
type
-
Various possibilities here:
- Add a constructor (and modify all definitions over the
type).
- Add a field to (a subset of) the constructors of a data type
(and modify all definitions over the type).
Perhaps these are refactorings?
Last modified 9/8/03