F#: A Data Structure For Modelling Directional Graphs

In my previous post, I described how to parse a regular expression pattern into an abstract syntax representation.  The next phase is to transform the parsed syntax into a finite state machine.  I have chosen to model the FSM as a mathematical directional graph (digraph).  This post describes the implementation of this module in preparation for the actual compilation step.

You can read the previous article here: F#: Building a Regular Expression Pattern Parser.

Overview

There are a number of ways in which a digraph can be modelled:

  • As a set of vertices and a set of edges
  • As a set of vertices and a table with positions for every possible edge
  • As a set of tuples comprising a vertex and a set of its edges

Each of these options has both advantages and disadvantages.  For example, it is very easy to find paths using the second approach but it requires space that grows at a rate of n^2 where n is the number of vertices.

I have chosen to use the third of these, often called an adjacency list approach.  I have also augmented the data structure with some other information:

  • integer identifiers to allow the addressing of a vertex or edge by a number
  • integer priorities on edges to allow edges to be sorted according to priority (this will be useful for the FSM)
  • arbitrary data to be associated with both edges and vertices

I have omitted several functions that a typical graph representation may support, such as union and intersection.  This is because I do not think they will be needed for the regular expression compilation.  Instead, I have added methods that support atomic changes to the digraph:

  • obtaining an empty graph
  • adding and removing edges
  • adding and removing vertices

Edges are only valid between existing vertices but the algorithm does not perform any checking.

The code follows.  Again, I have split the code into a module file with a separate signature:

Graph.fsi

#light

module Graph

  (* In this module, a graph is represented by an adjacency
     list.

     This module is not intended to be a high performance
     implementation.

     This module can represent a directed or undirected
     graph depending on the method of handling edges.
     The default behaviour results in a directed graph.
  *)

  type VertexData<'V> =
    int (* identifier *) *
    'V (* vertex data *)

  type EdgeData<'E> =
    int (* identifier *) *
    int (* priority *) *
    int (* vertex target *) *
    'E (* edge data *)

  (* The graph uses adjacency list notation *)
  type Adjacency<'E> = EdgeData<'E> list

  (* The Vertex type represents the internal structure
     of the graph *)
  type Vertex<'V, 'E> = VertexData<'V> * Adjacency<'E>

  (* A Graph is a Vertex list.  The nextNode allows for
     consistent addressing of nodes *)
  type Graph<'V, 'E> =
    int (* nextNode identifier *) *
    Vertex<'V, 'E> list

  (* Empty graph construction *)
  val empty: Graph<_,_>

  (* Helper methods for getting the data from a Vertex *)
  val vertexId : Vertex<_,_> -> int
  val vertexData : Vertex<'V,_> -> 'V
  (* Helper methods for getting the data from an Edge *)
  val edgeId : EdgeData<_> -> int
  val edgePriority : EdgeData<_> -> int
  val edgeTarget : EdgeData<_> -> int
  val edgeData : EdgeData<'E> -> 'E

  (* Getting a vertex from a graph by id *)
  val getVertex : int -> Graph<'V,'E> -> Vertex<'V,'E>
  (* Getting all edges from a graph by a vertex id *)
  val getEdges : int -> Graph<'V,'E> -> Adjacency<'E>

  (* Add a new vertex *)
  val addVertex : 'V -> Graph<'V,'E> -> 
      int (*new id*) * Graph<'V,'E>

  (* Add a new edge.  Edges include a priority value *)
  val addEdge :
    int (*priority*) ->
    int (*source vertex*) ->
    int (*target vertex*) ->
    'E (*edge data*) ->
    Graph<'V,'E> ->
    int (*new id*) * Graph<'V,'E>

  (* The edges aren't sorted by default so this function
     sorts them by priority *)
  val sortEdges : Adjacency<'E> -> Adjacency<'E>

  (* Removes an edge from a graph by id *)
  val removeEdge : int -> Graph<'V,'E> -> Graph<'V,'E>

  (* Removes a vertex from a graph by id and removes
     any related edges *)
  val removeVertex : int -> Graph<'V,'E> -> Graph<'V,'E>

Graph.fs

#light

module Graph
  (* -- Same description as above -- truncated! *)

  type VertexData<'V> =
    int (* identifier *) *
    'V (* vertex data *)

  type EdgeData<'E> =
    int (* identifier *) *
    int (* priority *) *
    int (* vertex target *) *
    'E (* edge data *)

  (* The graph uses adjacency list notation *)
  type Adjacency<'E> = EdgeData<'E> list

  (* The Vertex type represents the internal structure
     of the graph *)
  type Vertex<'V, 'E> = VertexData<'V> * Adjacency<'E>

  (* A Graph is a Vertex list.  The nextNode allows for
     consistent addressing of nodes *)
  type Graph<'V, 'E> =
    int (* nextNode identifier *) *
    Vertex<'V, 'E> list

  (* Empty graph construction *)
  let empty: Graph<_,_> = (0, [])

  (* Helper methods for getting the data from a Vertex *)
  let vertexId (v:Vertex<_,_>) = v |> fst |> fst
  let vertexData (v:Vertex<_,_>) = v |> fst |> snd
  (* Helper methods for getting the data from an Edge *)
  let edgeId ((x,_,_,_):EdgeData<_>) = x
  let edgePriority ((_,x,_,_):EdgeData<_>) = x
  let edgeTarget ((_,_,x,_):EdgeData<_>) = x
  let edgeData ((_,_,_,x):EdgeData<_>) = x

  (* Getting a vertex from a graph by id *)
  let getVertex v (g:Graph<_, _>) : Vertex<_,_> =
    snd g |> List.find (fun V -> vertexId V = v)
  (* Getting all edges from a graph by a vertex id *)
  let getEdges v (g:Graph<_, _>) =
    g |> getVertex v |> snd

  (* Add a new vertex *)
  let addVertex (v:'V) (g:Graph<'V, _>)
    : (int*Graph<'V,_>) =
      let id = fst g
      let s = snd g
      let newVD : VertexData<_> = (id, v)
      let newA : Adjacency<_> = []
      let newV = (newVD, newA)
      (id, (id + 1, newV::s))

  (* Add a new edge.  Edges include a priority value *)
  let addEdge priority
    (v:int) (v':int) (e:'E) (g:Graph<'V, 'E>)
    : (int*Graph<'V,'E>) =
      let id = fst g
      let s = snd g
      let newE : EdgeData<_> = (id, priority, v', e)
      (id,
        (id + 1,
          s |> List.map (fun V ->
            if (vertexId V) = v then
              (fst V, newE::(snd V))
            else V)))

  (* The edges aren't sorted by default so this function
     sorts them by priority *)
  let sortEdges (a:Adjacency<_>) =
    a |> List.sortBy edgePriority

  (* Removes an edge from a graph by id *)
  let removeEdge (id:int) (g:Graph<_,_>)
    : Graph<_,_> =
      let next = fst g
      let s = snd g
      (next, s |> List.map ( fun (v, a) ->
        (v, a |> 
            List.filter (fun x -> (edgeId x) <> id)))) 

  (* Removes a vertex from a graph by id and removes
     any related edges *)
  let removeVertex (id:int) (g:Graph<_,_>) 
    : Graph<_,_> =
      let next = fst g
      let s = snd g
      (next, s |> ([] |> List.fold (fun s' (v, a) ->
        if (fst v) = id then s'
        else
          let f = fun x -> ((edgeTarget x) <> id)
          let newA = a |> List.filter f
          let newV = (v, newA)
          newV::s')))

An example

The module is simple to use. The main trick is to identify the type of the empty graph. This is not necessary if it can be inferred but it is helpful when beginning to create the graph. Here is an example:

#light
open Graph

Graph.empty
  |> Graph.addVertex "vertex1" 
  |> (fun (v1,g) -> 
    (Graph.addVertex "vertex2" g)
    |> (fun (v2,g) -> 
      (Graph.addEdge 0 v1 v2 "edge1" g)))
  |> snd
  |> printf "Parsed graph: %A\n\n"

I have used nested lambda functions to bind the returned variables. You might notice that this is very similar to some of the computational expressions that F# can be used for, such as with Async. I might have a look into doing something like the following to hide the intermediate graph values:

let (g,v1,v2,e1) = graph {
    let! v1 = Graph.addVertex "vertex1"
    let! v2 = Graph.addVertex "vertex2"
    let! e1 = Graph.addEdge 0 v1 v2 "edge1"
    return! (v1,v2,e1)
  }
Advertisements

2 thoughts on “F#: A Data Structure For Modelling Directional Graphs

  1. Pingback: F#: Graphing with GLEE « Steve Horsfield

  2. Pingback: Rick Minerich's Development Wonderland : Discoveries This Week 08/02/2009

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s