Jun 26

Advanced Topics

Namespaces and Compilation Units

Namespaces and module names can be specified with an initial declaration at the head of a file, e.g.,

         module MyCompany.MyLibrary.MyModule

The final identifier is treated as the module name, the preceding identifiers as the namespace. If you place your F# code into a namespace you will need to explicitly open the module using its long identifier, for example:

         open MyLibrary.MyModule

By default a module has no namespace and the ‘OCaml rule’ is used for the module name, i.e., the first letter of the filename is capitalized. A default namespace can be specified via the –namespace command-line option. If any module declaration is given it overrides the default namespace.

A namespace declaration may replace the initial module declaration. A namespace can only contain types and modules, but not values, e.g.,

         namespace MyCompany.MyLibrary

         type MyType = class
           ...
         end
         module MyInnerModule = begin
            let myValue = 1
            ...
         end

The namespace Microsoft.FSharp is automatically opened for all F# code. By default the namespace Microsoft.FSharp.MLLib and the module Microsoft.FSharp.MLLib.Pervasives are also automatically opened unless the –no-mllib command-line option is given.

The “.” notation

F# supports a number of variations on the “.” notation, e.g.,

  • Qualified names, e.g., Microsoft.FSharp.MLLib.Set.empty or System.Console.In.
  • Member access expressions, e.g., (System.String.Join(”a”, “b”)).Length
  • Assigning to fields and writing to properties, e.g., form.Title

In general a dot-notation expression can resolve to an F# value, a .NET method group (which will require further overload resolution), a .NET property, a .NET field, an F# record member, a .NET constructor group or a .NET delegate constructor group. The context in which the dot-notation expression occurs will dictate the rules that apply after the resolution of the notation ? for example, a .NET property on the left of an assignment will become a call to a .NET property setter. More examples are given in the section on interoperability.

Only type information prior to the expression (or within the expression itself) will be used when determining the type of this expression, so you may need to place additional type constraints at the point of declaration of the variables involved. (The only exception is when the identifiers on the right of the lookup specify a record field label, and the type of the expression is not yet determined by partial type information. In this case, the type of the expression being accessed is inferred from the type of the record to which the field label belongs)

F# record and class fields may be accessed via the . notation even when the field names have not been brought into the top-level scope via an open. For example, if a module contains

           type recd = { Name: string }

then you may use x.Name if x is determined by partial type inference to be of type recd. Likewise, the record labels involved in a record construction can be determined by type annotations, e.g.

           let x : recd = { Name="3" }

The determination of the types that govern the interpretation of record labels is subject to the same inference rules as .NET member access and many other F# language constructs, i.e., the type of the value to the left of the . must be determined by type inference using annotations and other information available on a left-to-right, outside-to-inside analysis of the program text up to the point where the construct occurs.

When accessing a field, property or method member of a type T1 where the member is declared in some T2 a constraint is asserted that T1 coerces to T2 according to the coercion relation.

Reactive Recursion and Dynamic Initialization Soundness Checks

F# permits the specification of values whose specifications are mutually-referential, but where the mutual-references are hidden inside delayed values such as inner functions, other recursive functions, anonymous fun lambdas, lazy computations, and the’methods of object-implementation expressions. A much broader class of expressions can be used in “let rec” bindings, and in particular the expressions can be applications, method calls, constructor invocations and other computations.

Some such uses of recursion are runtime checked, because there is a possibility that the computations involved in evaluating the bindings may actually take the delayed computations and execute them. The F# compiler gives a warning for “let rec” bindings that may be involve a runtime check and inserts delays and thunks so that if runtime self-reference does occur then an exception will be raised.

The recursion is “reactive” because it is used when defining objects such as forms, controls and services that respond to various inputs. For example, GUI elements that store and retrieve the state of the GUI elements as part of their specification have this behaviour. A simple example is the following menu item, which prints out part of its state when invoked:

   let rec menuItem : MenuItem =
      new MenuItem("&Say Hello",
                   new EventHandler(fun sender e ->
                        Printf.printf "Hello! Text = %sn" menuItem.Text),
                   Shortcut.CtrlH)

A compiler warning is given when compiling this code because in theory the new MenuItem(…) constructor could evalute the callback as part of the construction process. It is a current research topic in the language design community to design type systems to catch these cases.

.NET Dynamic Type Tests

This is described in the section on interoperability.

Object Expressions

An object expression declares an implementation and/or extension of a class or interface. For example, the following uses the functionality of F#’s built-in polymorphic and > operators to construct an object that implements the .NET IComparer interface.

   {new IComparer
         with Compare(a,b) = if a 
              

Typically, F# generates a new class that implements/extends the given type. The required signature for Compare is inferred by looking at the unique virtual method that we must be overriding. In the longer example below note how the anonymous classes may close over free variables (e.g., capture the variable “n”) behind the scenes.

     open System
     open System.Collections
     open System.Windows.Forms
     let myComparer = {new IComparer with Compare(a,b) = compare a b}
     let myFormattable = {new IFormattable with ToString(fmt,fmtProvider) = "“}

     let myForm title n =
       let res =
        {new Form()
         with OnPaint(paintEventArgs) = Printf.printf “OnPaint: n = %dn” n
         and  OnResize(eventArgs) = Printf.printf “OnResize: n = %dn” } in
       res.Text 
              

You may only override/implement methods in anonymous classes. You may not declare fields or new methods. To override/implement virtual properties you should override the “get_PropertyName” and “set_PropertyName” methods that implement the property. Likewise to override virtual events you should override the “add_EventName”, “remove_EventName” and “fire_EventName” methods that implement the event.

Objects created via object expressions can also support multiple interfaces.. The syntax is:

   { new  with 
     interface  with 
     …
     interface  with  }

e.g.,

    { new Object()
        with Finalize() = cleanupFunction(false);

      interface IDisposable
        with Dispose() = cleanupFunction(true);  }

The interfaces are not part of the type of the object expression but can be revealed through type tests.

Accessing “this” in object expressions

F# members bind the name of the this parameter explicitly, see the quick tour.

There are several ways to access this in object expressions if necessary, all of quite similar. Firstly, you can use the reactive recursion feature described above. This feature results in a compiler warning that the initialization guaranteed for the object expression may not be as strong as you wish, and in particular that (in theory) the constructor for the object may invoke a virtual method which uses “this” before the object is initialized. For example

       let rec obj = {new System.Object()
                      with GetHashCode() = (obj.ToString()).Length}

Here the identifier obj plays the role of this. This example makes it plainer why let rec must be used: obj is certainly defined in terms of itself, i.e., its definition is self-referential or recursive.

Note that mutually-self-referential objects can be defined via the same mechanism:

         let rec obj1 = {new System.Object()
                         with GetHashCode() = (obj2.ToString()).Length}

         and     obj2 = {new System.Object()
                         with GetHashCode() = (obj1.ToString()).Length}

Accessing “base” in object expressions

An inescapable part of the design of .NET object-oriented libraries is that they require extensions of some class types to use the implementations of overriden methods as part of the definition of the extension. This is seen in C#’s “base” identifier.

F# permits object expressions that extend classes to be defined partly in terms of the base functionality of that class. This is done be labelling that functionality as part of the object expression:

        {new Form() as base
         with ... }

Here base is not an F# keyword, just as this is not an F# keyword. Any identifier can be used for base, and if object expressions are nested then different identifiers should be used at different points.

In the following example, the variable base has type Form, and can only be used to perform method invocations, e.g., as follows:

     let myForm =
        {new Form() as base
         with OnPaint(args) = base.OnPaint(args);  Printf.printf "OnPaintn" n
         and  OnResize(args) = base.OnResize(args); Printf.printf "OnResizen" n }

Accessing this in constructors

F# members bind the name of the this parameter explicitly, see the quick tour. Within constructors use the following:

   type x =
     class
       new() as x = ...
     end

This will give warnings if x is used in or before the construction expression. A full example:

   type x =
     class
       inherit MyDataBase
       val a : (unit -> string)
       val mutable b : string
       new() as x = { a = (fun () -> x.b) }
     end

Accessing “base” in classes

The overriden and protected functionality of an inherited class can be accessed by naming the inherited object, e.g., as follows:

   type x =
     class
       inherit System.Windows.Forms.Form as base
       ...
     end

Here base can again be any identifier, and can now be used in any non-static member in the implementation of the members of the type.

Initialization Semantics

The initialization semantics for toplevel bindings is as follows (which is more suitable for a multi-language, dynamic loading setting than the usual “evaluate everything” semantics given to ML programs.)

  • .EXEs: The only top-level bindings that are immediately evaluated on startup are those in the last module specified on the command-line when building a .EXE.
  • .DLLs: All bindings in a DLL are executed on demand, at most once each time a module is loaded into a .NET application domain. The execution may be triggered by the access of any of the fields or methods of a module. The granularity is guaranteed to imply that all the top-level bindings in a single F# source file are evaluated sequentially if any are evaluated.

Functions and Signatures

F# makes a deliberate distinction between the following two signatures, for both performance and .NET interoperability reasons

     val f: int -> int

and

     val f: (int -> int)

The parentheses indicate a top-level function which may (or may not) be a first-class computed expression that computes to a function value, rather than a compile-time function value. The first can only be satisfied by a true function, i.e., the implementation must be a lambda value as in

     let f x = x + 1

or

     let f = fun x -> x + 1

The second can be satisfied by any value of the appropriate type, e.g.,

     let f =
        let myTable = Hashtbl.create 4 in
        fun x ->
          match (Hashtbl.tryfind myTable x) with
          | None -> let res = x*x in Hashtbl.add myTable x res; res
          | Some(x) -> x

or

     let f = fun x -> x + 1

or even

     // throw an exception as soon as the module bindings are executed
     let f : int -> int = failwith "failure"

If necessary you can always parenthesize all top-level function types in all your signatures, and you will get Caml-style semantics. You can still use both kinds of functions as first-class function values from other code ? the parentheses simply act as a constraint on the implementation of the value. Both the optimizer and the .NET interoperability mechanisms can take advantage of the arity information provided.

The rationales for this interpretation of top-level types is that achieving good .NET interoperability requires F# functions to be compiled to methods, rather than to fields which are function values. Thus we inevitably need some kind of information in signatures to reveal the desired arity of a method as it is revealed to other .NET programming languages. We could use other annotations, e.g an explicit attribute syntax. However that would mean that these annotations would be required in the normal case where functions are implemented as true methods. F# is thus biased toward a mechanism that achieves high-quality interoperability without requiring a wealth of annotations to do so.

Unsurprisingly, programmers used to other ML languages find this unusual when they first see it described. OCaml programmers may find the analogy with datatype declarations helpful, where the declarations

     type c = C of int * int

and

     type c = C of (int * int)

differ in both OCaml and F# ? the parentheses are used to indicate that the constructor takes on argument that is a first-class pair value. Without the parentheses you can’t write

     match c with
       C p -> ...

but both interoperability and performance is improved if you use the first.

.NET Compatible Arrays when compiling for .NET v1.x

F# has two syntaxes for array types: string array and string[].

  • When compiling for .NET 2.0 and later there is no difference between these, and you can ignore this section completely.
  • When compiling for .NET 1.x there is a difference, which will have an impact on your code whenever you pass arrays to or from other .NET languages.

The rule to remember is If you want your code to compile on both .NET v1.x and v2.0, use types such as string[] and Point[] when you pass array values to or from .NET, and use the operations n Compatibility.CompatArray to manipulate these values. The [] types are guaranteed to be compatibile with the corresponding .NET types under any F# configuration.

For example, replace

       let points = Array.init ...

with

       let points = Compatibility.CompatArray.init ...

Using open Compatibility at the top of the file will help, e.g.,

       open Compatibility
       ...
       let points = CompatArray.init ...

If you find yourself only ever manipulating values using CompatArray then use the following to override the default array lookup and assignment operators.

       let inline (.()) arr n = CompatArray.get arr n
       let inline (.()
              

The details are as follows:

  • All versions of F# support generics (type parameters), even when compiling for .NET v1. F# code is compiled using erasure for .NET v1 and by generating .NET Generics for the .NET v2.0. For the most part you won’t notice any difference at all.
  • However, erasure leads to a problem for arrays. When F# compiles the type Point array, it uses erasure, so this becomes the C# type object[]. This is fine, until you start passing arrays to or from other .NET languages, e.g., when you make a call to C# that expects a Point[]. At this point something compiled as object[] is not good enough: you need an array type that is guaranteed-to-be- interop-compatibile under all F# configurations.
  • As such, F# has two syntaxes for array types: arr : string array and arr : string[].

    For .NET v2.0 and later there is no difference between these, and both are interop-compatible. For .NET v1.0/v1.1 only values of the latter are guaranteed to be compatible with .NET array types. (The former are for writing generic F# code that gets compiled using erasure.)

  • To generate and manipulate guaranteed-to-be-compatible-array-values such as Point[] you must use the operations in the module Microsoft.FSharp.Compatibility.CompatArray.
  • Array expressions [| … |] currently have type ‘a array, e.g., [| 3; 4 |] has type int array. To write guaranteed-to-be-compatible array constants use CompatArray.of_array, e.g., CompatArray.of_array [| 3; 4 |]

It is expected that by essentially all F# code will be compiled assuming .NET v2.0, so this is basically a transitional issue.

Abstraction Warnings

Some experimental F# constructs may not have corresponding constructs in the .NET Common Intermediate Language, and so alternative techniques are used to implement these features. In this case a warning may be given, if these warnings are enabled (e.g. using –all-warnings). For example, object interfaces may be made private in F#. However, when warnings are enabled you may see a message such as

    foo.ml (53:4): This method will be made public in the underlying IL
    because it implements an interface slot or overrides a method.

Warnings related to privacy, abstraction and other conditions also sometimes result from incomplete compiler implementation work (i.e., the construct could in theory be made private, but the work to do that hasn’t yet been completed). Not all such cases are necessarily currently flagged as such by the above compiler option. There may also be other ways in which a theoretical notion of “full abstraction” fails for F# compilation, e.g., it may currently under certain circumstances be possible to use backdoor mechanisms like reflection to access compiler-exposed functionality of F# values and types. When that is possible it is generally considered a bug in the compiler and/or language definition, so should be reported.

Jun 26

The F# Programming Language Informal Specification

Jun 26

F# from C# and other .NET Languages

An Example

In general values can be accessed simply by calling the available values as static methods or using them as static fields.

1. Introduction

F# is a scalable, script-like, type-safe, efficiently executing functional/imperative/object-oriented programming language. It aims to be the premier type-safe symbolic programming language for the .NET platform.

This manual describes the F# language through a mixture of informal and semi-formal techniques.

F# is similar to the Caml programming language family, and was partially inspired by it. See F# and OCaml for more details.

2. Lexical Specification

Comments. Comments are delimited by (* and *) and may be nested. In addition, C#/C++-style // single-line comments may be used and extend to the end of a line.

Strings embedded within comments are parsed without looking for closing *) marks, e.g., (* Here’s a code snippet: let s = “*)” *) is a valid comment.

Conditional Compilation. #if ident/#endif directives delimit conditional compilation sections. Text in a #if section is included in the compilation if the given ident is defined in the compilation environment, typically via the command line option –define. F# also allows code to be cross-compiled as both F# and OCaml code. Sections marked

         (*IF-FSHARP  ... ENDIF-FSHARP*)
     or  (*F#         ... F#*)

are included when compiling with the F# compiler, and text surrounded by

         (*IF-CAML*)  ...  (*ENDIF-CAML*)
      or (*IF-OCAML*)  ... (*ENDIF-OCAML*)

is excluded when compiling with the F# compiler. Of course the converse holds when compiling programs using an OCaml compiler.

Keywords. The keywords of the F# language are shown below.

abstract and as assert begin class default delegate do done downcast downto else end
enum exception extern false finally for fun function if in inherit interface land lazy let
match member  module mutable namespace new null of open or override rec sig static struct then to true try
type val when inline upcast while with void

The following identifiers are also keywords because they are keywords in OCaml. However their use is not recommended in F# code, since symbolic alternatives are available.

asr land lor lsl lsr lxor mod

The symbols act as keywords:

| -> ->> ] [| |] { } #

The following identifiers are used in OCaml and/or C# and are reserved for future use by F#. The –ml-compatibility option permits OCaml identifiers to be used.

async atomic break checked component const constraint
constructor continue decimal eager event
external fixed functor include method mixin object
process property protected public pure readonly return sealed
virtual volatile {}

Operators and Symbolic Keywords. Operator names are sequences of characters as shown below, except where a combination of characters is used as a symbol elsewhere in the language. Precedence is specified below.

    first-op-char := !$%&*+-./?@^|~
    op-char       := first-op-char | :
    operator      := first-op-char op-char*

For example, &&& and ||| are valid operators. These have default definitions as overloaded bitwise manipulations on integer types.

Certain Unicode characters may also be included in the operator set. The final choice of such characters is as yet unspecified, though it may be specified in the change notices with each compiler release.

In addition, the following operators correspond to syntactic forms for expressions.

    .[]       (expression form "a.[i]")
    .[]
                    

The .[] forms have default definitions as overloaded lookup and assignment on arrays, dictionarys, maps and any value supporting an Item property. The .() forms have default definitions as array lookup. In theory the operators can be redefined, e.g., the example below shows how to do this for byte arrays, but this is rarely used in practice:

    let (.[])   s n   = Bytearray.get s n
    let (.[]
                    

For compatibility with OCaml the following identifiers are parsed as infix operators, corresponding to logical bitwise manipulations:

    land lor lsl lsr lxor mod

However the corresponding overloaded symbolic operators (&&& etc.) are often preferred.

The following operators are assigned a meaning via lifted term quotation:

    quotation-operator-left :=
| op-char*
|