Jun 26

A Sample of the Memoization Pattern in F#

Pieter Breed has been asking some useful questions on the F# list recently about using the “Map” type. I really appreciate it when people take the time to let us know when data structures are tricky to learn to use in F#.

Looking at Pieter’s blog made me think it might be instructive to show how the memoization pattern looks in F# (or at least one example of the pattern).

Jun 26

Libraries

The source code for both fslib.dll and mllib.dll is included as sample code in the release. All F# programs implicitly reference both mllib.dll and fslib.dll as well as the following .NET assemblies, where available:

  • mscorlib.dll
  • System.dll
  • System.Xml.dll
  • System.Runtime.Remoting.dll
  • System.Data.dll
  • System.Drawing.dll
  • System.Web.dll
  • System.Web.Services.dll
  • System.Windows.Forms.dll

Some of the namespaces, modules and types in the F# library are:

This namespaces Microsoft.FSharp.Core, Microsoft.FSharp.Collections, Microsoft.FSharp.Control and Microsoft.FSharp.Text are automatically opened for every source code file, as is the module Microsoft.FSharp.Core.Operators. This means modules such as Microsoft.FSharp.Collections.List can typically be opened just by using open List.

Some modules are primarily of interest for those cross-compiling OCaml code, e.g. those under the Microsoft.FSharp.Compatibility.OCaml namespace.

 
Jun 26

Namespaces and Compilation Units

Namespaces and Compilation Units

Namespaces and module names can be specified with an initial declaration at the head of a file, e.g.,

         module MyCompany.MyLibrary.MyModule

The final identifier is treated as the module name, the preceding identifiers as the namespace. If you place your F# code into a namespace you will need to explicitly open the module using its long identifier, for example:

         open MyLibrary.MyModule

By default a module has no namespace and the ‘OCaml rule’ is used for the module name, i.e., the first letter of the filename is capitalized. A default namespace can be specified via the –namespace command-line option. If any module declaration is given it overrides the default namespace.

A namespace declaration may replace the initial module declaration. A namespace can only contain types and modules, but not values, e.g.,

         namespace MyCompany.MyLibrary

         type MyType = class
           ...
         end
         module MyInnerModule = begin
            let myValue = 1
            ...
         end

The namespace Microsoft.FSharp is automatically opened for all F# code. By default the namespace Microsoft.FSharp.MLLib and the module Microsoft.FSharp.MLLib.Pervasives are also automatically opened unless the –no-mllib command-line option is given.

The “.” notation

F# supports a number of variations on the “.” notation, e.g.,

  • Qualified names, e.g., Microsoft.FSharp.MLLib.Set.empty or System.Console.In.
  • Member access expressions, e.g., (System.String.Join(”a”, “b”)).Length
  • Assigning to fields and writing to properties, e.g., form.Title

In general a dot-notation expression can resolve to an F# value, a .NET method group (which will require further overload resolution), a .NET property, a .NET field, an F# record member, a .NET constructor group or a .NET delegate constructor group. The context in which the dot-notation expression occurs will dictate the rules that apply after the resolution of the notation ? for example, a .NET property on the left of an assignment will become a call to a .NET property setter. More examples are given in the section on interoperability.

Only type information prior to the expression (or within the expression itself) will be used when determining the type of this expression, so you may need to place additional type constraints at the point of declaration of the variables involved. (The only exception is when the identifiers on the right of the lookup specify a record field label, and the type of the expression is not yet determined by partial type information. In this case, the type of the expression being accessed is inferred from the type of the record to which the field label belongs)

F# record and class fields may be accessed via the . notation even when the field names have not been brought into the top-level scope via an open. For example, if a module contains

           type recd = { Name: string }

then you may use x.Name if x is determined by partial type inference to be of type recd. Likewise, the record labels involved in a record construction can be determined by type annotations, e.g.

           let x : recd = { Name="3" }

The determination of the types that govern the interpretation of record labels is subject to the same inference rules as .NET member access and many other F# language constructs, i.e., the type of the value to the left of the . must be determined by type inference using annotations and other information available on a left-to-right, outside-to-inside analysis of the program text up to the point where the construct occurs.

When accessing a field, property or method member of a type T1 where the member is declared in some T2 a constraint is asserted that T1 coerces to T2 according to the coercion relation.

Reactive Recursion and Dynamic Initialization Soundness Checks

F# permits the specification of values whose specifications are mutually-referential, but where the mutual-references are hidden inside delayed values such as inner functions, other recursive functions, anonymous fun lambdas, lazy computations, and the’methods of object-implementation expressions. A much broader class of expressions can be used in “let rec” bindings, and in particular the expressions can be applications, method calls, constructor invocations and other computations.

Some such uses of recursion are runtime checked, because there is a possibility that the computations involved in evaluating the bindings may actually take the delayed computations and execute them. The F# compiler gives a warning for “let rec” bindings that may be involve a runtime check and inserts delays and thunks so that if runtime self-reference does occur then an exception will be raised.

The recursion is “reactive” because it is used when defining objects such as forms, controls and services that respond to various inputs. For example, GUI elements that store and retrieve the state of the GUI elements as part of their specification have this behaviour. A simple example is the following menu item, which prints out part of its state when invoked:

   let rec menuItem : MenuItem =
      new MenuItem("&Say Hello",
                   new EventHandler(fun sender e ->
                        Printf.printf "Hello! Text = %sn" menuItem.Text),
                   Shortcut.CtrlH)

A compiler warning is given when compiling this code because in theory the new MenuItem(…) constructor could evalute the callback as part of the construction process. It is a current research topic in the language design community to design type systems to catch these cases.

.NET Dynamic Type Tests

This is described in the section on interoperability.

Object Expressions

An object expression declares an implementation and/or extension of a class or interface. For example, the following uses the functionality of F#’s built-in polymorphic and > operators to construct an object that implements the .NET IComparer interface.

   {new IComparer
         with Compare(a,b) = if a 
              

Typically, F# generates a new class that implements/extends the given type. The required signature for Compare is inferred by looking at the unique virtual method that we must be overriding. In the longer example below note how the anonymous classes may close over free variables (e.g., capture the variable “n”) behind the scenes.

     open System
     open System.Collections
     open System.Windows.Forms
     let myComparer = {new IComparer with Compare(a,b) = compare a b}
     let myFormattable = {new IFormattable with ToString(fmt,fmtProvider) = "“}

     let myForm title n =
       let res =
        {new Form()
         with OnPaint(paintEventArgs) = Printf.printf “OnPaint: n = %dn” n
         and  OnResize(eventArgs) = Printf.printf “OnResize: n = %dn” } in
       res.Text 
              

You may only override/implement methods in anonymous classes. You may not declare fields or new methods. To override/implement virtual properties you should override the “get_PropertyName” and “set_PropertyName” methods that implement the property. Likewise to override virtual events you should override the “add_EventName”, “remove_EventName” and “fire_EventName” methods that implement the event.

Objects created via object expressions can also support multiple interfaces.. The syntax is:

   { new  with 
     interface  with 
     …
     interface  with  }

e.g.,

    { new Object()
        with Finalize() = cleanupFunction(false);

      interface IDisposable
        with Dispose() = cleanupFunction(true);  }

The interfaces are not part of the type of the object expression but can be revealed through type tests.

Accessing “this” in object expressions

F# members bind the name of the this parameter explicitly, see the quick tour.

There are several ways to access this in object expressions if necessary, all of quite similar. Firstly, you can use the reactive recursion feature described above. This feature results in a compiler warning that the initialization guaranteed for the object expression may not be as strong as you wish, and in particular that (in theory) the constructor for the object may invoke a virtual method which uses “this” before the object is initialized. For example

       let rec obj = {new System.Object()
                      with GetHashCode() = (obj.ToString()).Length}

Here the identifier obj plays the role of this. This example makes it plainer why let rec must be used: obj is certainly defined in terms of itself, i.e., its definition is self-referential or recursive.

Note that mutually-self-referential objects can be defined via the same mechanism:

         let rec obj1 = {new System.Object()
                         with GetHashCode() = (obj2.ToString()).Length}

         and     obj2 = {new System.Object()
                         with GetHashCode() = (obj1.ToString()).Length}

Accessing “base” in object expressions

An inescapable part of the design of .NET object-oriented libraries is that they require extensions of some class types to use the implementations of overriden methods as part of the definition of the extension. This is seen in C#’s “base” identifier.

F# permits object expressions that extend classes to be defined partly in terms of the base functionality of that class. This is done be labelling that functionality as part of the object expression:

        {new Form() as base
         with ... }

Here base is not an F# keyword, just as this is not an F# keyword. Any identifier can be used for base, and if object expressions are nested then different identifiers should be used at different points.

In the following example, the variable base has type Form, and can only be used to perform method invocations, e.g., as follows:

     let myForm =
        {new Form() as base
         with OnPaint(args) = base.OnPaint(args);  Printf.printf "OnPaintn" n
         and  OnResize(args) = base.OnResize(args); Printf.printf "OnResizen" n }

Accessing this in constructors

F# members bind the name of the this parameter explicitly, see the quick tour. Within constructors use the following:

   type x =
     class
       new() as x = ...
     end

This will give warnings if x is used in or before the construction expression. A full example:

   type x =
     class
       inherit MyDataBase
       val a : (unit -> string)
       val mutable b : string
       new() as x = { a = (fun () -> x.b) }
     end

Accessing “base” in classes

The overriden and protected functionality of an inherited class can be accessed by naming the inherited object, e.g., as follows:

   type x =
     class
       inherit System.Windows.Forms.Form as base
       ...
     end

Here base can again be any identifier, and can now be used in any non-static member in the implementation of the members of the type.

Initialization Semantics

The initialization semantics for toplevel bindings is as follows (which is more suitable for a multi-language, dynamic loading setting than the usual “evaluate everything” semantics given to ML programs.)

  • .EXEs: The only top-level bindings that are immediately evaluated on startup are those in the last module specified on the command-line when building a .EXE.
  • .DLLs: All bindings in a DLL are executed on demand, at most once each time a module is loaded into a .NET application domain. The execution may be triggered by the access of any of the fields or methods of a module. The granularity is guaranteed to imply that all the top-level bindings in a single F# source file are evaluated sequentially if any are evaluated.

Functions and Signatures

F# makes a deliberate distinction between the following two signatures, for both performance and .NET interoperability reasons

     val f: int -> int

and

     val f: (int -> int)

The parentheses indicate a top-level function which may (or may not) be a first-class computed expression that computes to a function value, rather than a compile-time function value. The first can only be satisfied by a true function, i.e., the implementation must be a lambda value as in

     let f x = x + 1

or

     let f = fun x -> x + 1

The second can be satisfied by any value of the appropriate type, e.g.,

     let f =
        let myTable = Hashtbl.create 4 in
        fun x ->
          match (Hashtbl.tryfind myTable x) with
          | None -> let res = x*x in Hashtbl.add myTable x res; res
          | Some(x) -> x

or

     let f = fun x -> x + 1

or even

     // throw an exception as soon as the module bindings are executed
     let f : int -> int = failwith "failure"

If necessary you can always parenthesize all top-level function types in all your signatures, and you will get Caml-style semantics. You can still use both kinds of functions as first-class function values from other code ? the parentheses simply act as a constraint on the implementation of the value. Both the optimizer and the .NET interoperability mechanisms can take advantage of the arity information provided.

The rationales for this interpretation of top-level types is that achieving good .NET interoperability requires F# functions to be compiled to methods, rather than to fields which are function values. Thus we inevitably need some kind of information in signatures to reveal the desired arity of a method as it is revealed to other .NET programming languages. We could use other annotations, e.g an explicit attribute syntax. However that would mean that these annotations would be required in the normal case where functions are implemented as true methods. F# is thus biased toward a mechanism that achieves high-quality interoperability without requiring a wealth of annotations to do so.

Unsurprisingly, programmers used to other ML languages find this unusual when they first see it described. OCaml programmers may find the analogy with datatype declarations helpful, where the declarations

     type c = C of int * int

and

     type c = C of (int * int)

differ in both OCaml and F# ? the parentheses are used to indicate that the constructor takes on argument that is a first-class pair value. Without the parentheses you can’t write

     match c with
       C p -> ...

but both interoperability and performance is improved if you use the first.

.NET Compatible Arrays when compiling for .NET v1.x

F# has two syntaxes for array types: string array and string[].

  • When compiling for .NET 2.0 and later there is no difference between these, and you can ignore this section completely.
  • When compiling for .NET 1.x there is a difference, which will have an impact on your code whenever you pass arrays to or from other .NET languages.

The rule to remember is If you want your code to compile on both .NET v1.x and v2.0, use types such as string[] and Point[] when you pass array values to or from .NET, and use the operations n Compatibility.CompatArray to manipulate these values. The [] types are guaranteed to be compatibile with the corresponding .NET types under any F# configuration.

For example, replace

       let points = Array.init ...

with

       let points = Compatibility.CompatArray.init ...

Using open Compatibility at the top of the file will help, e.g.,

       open Compatibility
       ...
       let points = CompatArray.init ...

If you find yourself only ever manipulating values using CompatArray then use the following to override the default array lookup and assignment operators.

       let inline (.()) arr n = CompatArray.get arr n
       let inline (.()
              

The details are as follows:

  • All versions of F# support generics (type parameters), even when compiling for .NET v1. F# code is compiled using erasure for .NET v1 and by generating .NET Generics for the .NET v2.0. For the most part you won’t notice any difference at all.
  • However, erasure leads to a problem for arrays. When F# compiles the type Point array, it uses erasure, so this becomes the C# type object[]. This is fine, until you start passing arrays to or from other .NET languages, e.g., when you make a call to C# that expects a Point[]. At this point something compiled as object[] is not good enough: you need an array type that is guaranteed-to-be- interop-compatibile under all F# configurations.
  • As such, F# has two syntaxes for array types: arr : string array and arr : string[].

    For .NET v2.0 and later there is no difference between these, and both are interop-compatible. For .NET v1.0/v1.1 only values of the latter are guaranteed to be compatible with .NET array types. (The former are for writing generic F# code that gets compiled using erasure.)

  • To generate and manipulate guaranteed-to-be-compatible-array-values such as Point[] you must use the operations in the module Microsoft.FSharp.Compatibility.CompatArray.
  • Array expressions [| … |] currently have type ‘a array, e.g., [| 3; 4 |] has type int array. To write guaranteed-to-be-compatible array constants use CompatArray.of_array, e.g., CompatArray.of_array [| 3; 4 |]

It is expected that by essentially all F# code will be compiled assuming .NET v2.0, so this is basically a transitional issue.

Abstraction Warnings

Some experimental F# constructs may not have corresponding constructs in the .NET Common Intermediate Language, and so alternative techniques are used to implement these features. In this case a warning may be given, if these warnings are enabled (e.g. using –all-warnings). For example, object interfaces may be made private in F#. However, when warnings are enabled you may see a message such as

    foo.ml (53:4): This method will be made public in the underlying IL
    because it implements an interface slot or overrides a method.

Warnings related to privacy, abstraction and other conditions also sometimes result from incomplete compiler implementation work (i.e., the construct could in theory be made private, but the work to do that hasn’t yet been completed). Not all such cases are necessarily currently flagged as such by the above compiler option. There may also be other ways in which a theoretical notion of “full abstraction” fails for F# compilation, e.g., it may currently under certain circumstances be possible to use backdoor mechanisms like reflection to access compiler-exposed functionality of F# values and types. When that is possible it is generally considered a bug in the compiler and/or language definition, so should be reported.