The fold functions in functional languages are very powerful constructs, but how about when you want a different function applied to the first element, or to the last? What about when the sequence is empty? Rather than solving this problem repeatedly, I have written a few functions to cover the range of cases. As I wrote these, I found it interesting that the solutions for Array, Seq and List all look very different.

Here is a simple scenario. I have a list of items that I want to convert to strings, and I also want to put a separator between them. Importantly, a trailing separator must be avoided. (Note: this is a very simplistic case of a wider range of problems). To do this with fold means writing extra functionality to track whether a separator is needed. In general, you need something like a continuation function. To provide generic functionality, I believe you would end up writing something like I have yourself.

**Some library functions**

To support the implementation of the generic fold (as I have called it), *gfold*, I have the following library functions:

let abort<'T,'U> (_:'T) : 'U = failwith "ABORT!" let combineLeft2 fx fy a x y = fy (fx a x) y let combineRight2 fx fy a x y = fy y (fx x a)

*Note: I do not use combineRight2 but I have included it for completeness.*

**abort** is a function that can be used in the place of any function where the execution is always a logic error. I could just use *failwith* but the advantages of *abort* are that it is a function that can be used directly and that it makes it clear that this is not a runtime error caused by an invalid operation but that it is a logic error in the program itself. I use *abort* to handle pattern matches that are not relevant and cannot occur.

**combineLeft2** combines two functions that each take two parameters and where the first input to the second function is the result of applying the first function. This may seem obscure, but the standard *fold* functions take a function that takes a state value as the initial parameter and then an enumerated item as a second parameter. Applying this type of function to two items in a list, *[x1;x2]*, is as follows:

f (f s x1) x2

However, if the two functions are different then:

f2 (f1 s x1) x2

Which is equivalent to:

let f f1 f2 s x1 x2 = f2 (f1 s x1) x2) in f s x1 x2

Or more simply:

(combineLeft2 f1 f2) s x1 x2

And so *combineLeft2 f1 f2* results in a single function value that takes three curried arguments.

**Generic fold**

The basic goal behind *gfold* is to allow a different function to be applied to the first, middle and last elements. However, this raises the question of what to do with only one or two items in the sequence. Also, it is usual for *fold* to return its initial state on an empty sequence, but unless you track the count of items there is no way of detecting whether the sequence is empty using *fold* itself (although each type of sequence has its own functions for doing this). So, my generic fold has functions to support each of these boundary cases:

let gfold fZero fOne fTwo fFirst fMiddle fLast s sequence

The parameters have the following types:

*fZero : ‘a -> ‘b**fOne : ‘a -> ‘c -> ‘b**fTwo : ‘a -> ‘c -> ‘c -> ‘b**fFirst : ‘a -> ‘c -> ‘d**fMiddle : ‘d -> ‘c -> ‘d**fLast : ‘d -> ‘c -> ‘b**s : ‘a**sequence : ‘c seq***or**‘c list**or**‘c array**result**: ‘b

**List implementation**

The implementation of generic fold on a list is the most intuitive:

let gfold fZero fOne fTwo fFirst fMiddle fLast s list = match list with | [] -> fZero s | [x] -> fOne s x | [x;y] -> fTwo s x y | x::xs -> let rec inner s list = match list with | [x] -> fLast s x | x::xs -> inner (fMiddle s x) xs | [] -> abort() inner (fFirst s x) xs

I hope that is fairly easy to read.

Using this implementation of *gfold*, it is easy to write functions that act differently on the first or last item of a list:

let gfoldFirst fZero fFirst fAny s list = gfold fZero fFirst (combineLeft2 fFirst fAny) fFirst fAny fAny s list let gfoldLast fZero fAny fLast s list = gfold fZero fLast (combineLeft2 fAny fLast) fAny fAny fLast s list

If you know that the list has at least one item then you can replace *fZero* with abort.

**Array implementation**

let gfold fZero fOne fTwo fFirst fMiddle fLast s (arr:array<_>) = let arrn = arr.Length if arrn = 0 then fZero s elif arrn = 1 then fOne s arr.[0] elif arrn = 2 then fTwo s arr.[0] arr.[1] else let mutable res = fFirst s arr.[0] for i = 1 to (arrn-2) do res <- fMiddle res arr.[i] fLast res arr.[arrn-1][/sourcecode] The code for arrays is less functional in style (it does not use pattern matching and it uses mutable storage), but that is because indexed access on an array is more efficient. The implementation for <em>gfoldFirst</em> and <em>gfoldLast</em> is identical to the implementation for lists. <strong>Sequence implementation</strong> let gfold fZero fOne fTwo fFirst fMiddle fLast s (sequence:_ seq) = use e = sequence.GetEnumerator() if e.MoveNext() then let x1 = e.Current if e.MoveNext() then let x2 = e.Current if e.MoveNext() then let mutable res = fMiddle (fFirst s x1) x2 let mutable x = e.Current while e.MoveNext() do res <- fMiddle res x x <- e.Current fLast res x else fTwo s x1 x2 else fOne s x1 else fZero s[/sourcecode] This code for sequences is not particularly nice. However, the alternative of using <em>fold</em> with a continuation function and saved values that I attempted convinced me that this was a better approach. If someone has a more elegant solution I would be interested in seeing it. Nevertheless, this implementation is efficient. As with arrays, the implementation of <em>gfoldFirst</em> and <em>gfoldLast</em> is identical to the implementation for lists. <strong>An example</strong> Using the scenario of inserting separators between strings, here is a simple example representing the summation of numbers: printf "%s\n\n" (gfoldLast (fun (t,s) -> t + " No calculation: " + (string s)) (fun (t,s) -> fun x -> t + " " + (string x) + " + [running total: " + (string (s+x)) + "]\n" , s + x) (fun (t,s) -> fun x -> t + " " + (string x) + " = " + (string (s+x))) ("", 0) [1 ; 2 ; 3 ; 4])

The output from this is as follows:

> 1 + [running total: 1] 2 + [running total: 3] 3 + [running total: 6] 4 = 10 val it : unit = ()

*Note: this example is only intended to illustrate a use of gfoldLast*

Whether to use *gfold* or another construct needs a certain amount of judgement as the resulting code may be more complicated. In this case, there are two intermediary state values (the text and the running total) but only a single final result (the text). Using *gfold* directly would allow the initial running total to be hidden but would have required more code because of the extra function arguments. This code would also be difficult using the standard *fold* function because of the need to handle the last case differently. This could be managed using a complex state variable but I think the solution using *gFoldLast* is the easiest to understand.

**Final comments**

To wrap the code into a standard library in my code, I have extended the standard modules in *Microsoft.FSharp.Collections*. This makes the generic fold implementations available in the same way as the standard fold implementations.

As always, comments are welcome.

James MooreWhat about an initial transformation into a sequence of Positions (Beginning,Middle,End,Only) like this:

type Position =

| Beginning of ‘t

| Middle of ‘t

| End of ‘t

| Only of ‘t

let rec sequenceOfPositionsWithLazyList(existing: LazyList) =

seq {

let rec elementsAfterBeginning(s) =

seq {

match s with

| LazyList.Cons(h, LazyList.Nil) -> yield End(h); ()

| LazyList.Cons(h, t) -> yield Middle(h); yield! elementsAfterBeginning(t)

| _ -> ()

}

match existing with

| LazyList.Cons(h, LazyList.Nil) -> yield Only(h); ()

| LazyList.Cons(h, t) -> yield Beginning(h); yield! elementsAfterBeginning(t)

| LazyList.Nil -> ()

}

let sequenceOfPositions(existing: seq) = sequenceOfPositionsWithLazyList(LazyList.of_seq existing)

printfn “%A” (sequenceOfPositions [1;2;3])

printfn “%A” (sequenceOfPositions [1;3])

printfn “%A” (sequenceOfPositions [1])

printfn “%A” (sequenceOfPositions [])

That gets you:

seq [Beginning 1; Middle 2; End 3]

seq [Beginning 1; End 3]

seq [Only 1]

seq []

And I think that makes all of your fold functions much easier to write.

James Moore(Looks like the comment system ate bits of the code – I put it up on my blog: http://jamesmoorecode.blogspot.com/2009/09/sequence-of-positions-in-sequence-f.html)

Steve HorsfieldPost authorThanks James.

I do like your approach. I wasn’t trying to make a single generic approach to the implementation as I was hoping to demonstrate the differences between the different containers (seq, list, array).

So long as you then write “gfold” using the transformed sequence involving Position then all is good. If you want to use fold directly on a Position sequence then you will need to use a match function which means that you are rewriting the same code for each case, each time you use it.

Finally, if you are using Position directly then both Only and End are redundant cases. “Only” is “Start Cons Null” and End is “Middle Cons Null”, similarly (but missing from your example) “Two” is “Start Cons Middle Cons Null”.

Thanks for your comment, I do like your approach of using sequence comprehensions.