Thursday, April 29, 2010

Glorious Hack: pretty-mode for Emacs

Today I was lamenting the lack of unicode identifiers and strings in Standard ML.  I asked on the #emacs irc channel how hard it would be to display the string "alpha" as the glyph α, for example.  User `jlf' pointed me to pretty-mode for emacs from Arthur Danskin, a student at Johns Hopkins:

www.emacswiki.org/emacs/pretty-mode.el

This is a beautiful hack.  It uses font-lock mode to render strings as other strings rather than just coloring the text in the buffer.  It is easy to add your own unicode characters to display, and the code will compile because only the view is modified.  It's possible to use two different codes for α, one for source code identifiers that expands to a legal identifier in the language and another that expands to a unicode sequence that will print as the unicode character.  This is handy for a language like Standard ML (MLton) where unicode is allowed in neither identifiers nor strings.  For instance


val forall =  "\226\136\128"

prints as

val ∀ = "∀"

Sand art

theswedishbed.wordpress.com/2010/04/27/sand-art-a-new-kind-of-land-art/

Wednesday, April 28, 2010

Lazy evaluation for tree replacement in Haskell, OCaml, and SML

(The Haskell code in this post comes from Julia Lawall's paper "Implementing circularity using partial evaluation".  Note also that this post has been significantly rewritten thanks to Jake Donham, a former colleague at both NYU and CMU, who pointed out I didn't know what I was talking about.)

An interesting application of lazy evaluation is to replace all the elements of an int tree by the smallest element without traversing the tree twice.  The pretty Haskell program that does this is:


rm t = fst p
  where p = repmin t (snd p)
        repmin (Tip n) m = (Tip m, n)
        repmin (Fork l r) m = (Fork t1 t2, min m1 m2)
          where (t1, m1) = repmin l m 
                (t2, m2) = repmin r m


We can see this is correct by looking at an invariant of repmin.  Repmin always returns the smallest element of the tree in the second element of the pair, and a tree isomorphic to the first argument filled in with the second argument.  Laziness allows us to provide the minimal value after we've traversed the tree.  Looking at the tree will then give you the correct result.  

In this case laziness provides an elegant solution which would be more cumbersome in SML or Ocaml.  I wondered how much more cumbersome.  I find it difficult to understand the space usage of my Haskell programs, and despite its incredible elegance and advanced features, I prefer Standard ML when my code has to be fast.   I implemented near identical copies of my theorem prover for propositional intuitionistic logic in Haskell and SML.  The Haskell version uses 10X the memory.  Of course, there is a high probability that this is because I'm not a very good Haskell programmer.  Yet the techniques used by experts to achieve good performance can make beautiful code ugly.  I don't think I am alone in this view.  For instance, the Haskell examples at the Programming Languages Shootout are written by experts such as one of the authors of Real World Haskell.  For instance


The programs are laden with strictness annotations and "bang patterns" which basically force the evaluation at the pattern match site.  This suggests to me that often laziness is not the ideal default mode of use.  Here at CMU we teach that laziness is a mode of use of strictness by implementing laziness using closures and references (see below).  However, the example above is beautiful and compelling.  How ugly does it become in strict languages.  Below are some experiments.  If you can do better, I'd be delighted to see your examples.

OCaml

OCaml has some syntactic support for lazy evaluation.  (Note the use of the special syntax `lazy'.)

open Lazy

type 'a tree = Tip of 'a | Fork of 'a tree * 'a tree

let snd (_, y) = y

let rec tmap f t = match t with
  | Tip a -> Tip (f a)
  | Fork (l, r) -> Fork (tmap f l, tmap f r)

let fmap f x = lazy (f (force x))

(* Without type annotations *)
let fm t = 
  let rec repmin t m = 
    match t with
    | Tip n -> (print_int n; print_newline(); (Tip m, n))
    | Fork(l, r) ->
        let (t1, m1) = repmin l m in
        let (t2, m2) = repmin r m in
          (Fork (t1, t2), min m1 m2) in
  let rec p = lazy (repmin t (fmap snd p)) in
    fst (Lazy.force p)

(* With type annotations *)

let fm : int tree -> int t tree =
 fun t ->
    let rec repmin : int tree -> int t -> int t tree * int =
      fun t m -> match t with
        | Tip n -> (print_int n; print_newline(); (Tip m, n))
        | Fork(l, r) ->
            let (t1, m1) = repmin l m in
            let (t2, m2) = repmin r m in
              (Fork (t1, t2), min m1 m2) in
    let rec p = lazy (repmin t (fmap snd p)) in
      fst (Lazy.force p)

let _ = tmap force (fm (Fork(Tip 5, Tip 1)));;

What I like about this code, especially the version with type annotations, is that it is clear that
the second element of the pair returned is strict.   The first version is closer to the Haskell implementation.  It's not so bad except for the need for a special application (`fmap') and the explicit argument to `p'.  

Standard ML

To implement laziness, we need to do it ourselves in SML.  (The SML/NJ compiler has something similar to OCaml's Lazy module, though since my primary compiler is MLton I decided to write it from scratch.)  Update.  This problem is harder than I imagined in the first post.  It seems difficult to solve this problem in SML without resorting to effects apart from the lazy evaluation effects; i.e. making a tree
whose leaves are a ref cell.  There is a rather easy solution using continuations to rebuild the tree once you know the smallest element, but this also traverses the tree twice.  The following solution doesn't work.  


signature LAZY = 
sig
  type 'a t
  val inject_val : 'a -> 'a t
  val force : 'a t -> 'a
  val make : (unit -> 'a) -> 'a t
  val fmap : ('a -> 'b) -> 'a t -> 'b t
  val fmap2 : ('a * 'b -> 'c) -> 'a t * 'b t -> 'c t
  val concat : 'a t t -> 'a t
  val fix : ('a t -> 'a t) -> 'a t
end 

structure Lazy' :> LAZY =
struct 
  datatype 'a laz = Forced of 'a
                  | Thunk of unit -> 'a
  type 'a t = 'a laz ref
  fun inject_val x = ref (Forced x)
  fun force (ref (Forced x)) = x
    | force (l as (ref (Thunk f))) = 
      let
         val x = f ()
      in 
         l := Forced x
       ; x
      end
  fun make f = ref (Thunk f) 
  fun fmap f x = ref (Thunk (fn () => f (force x)))
  fun fmap2 f (x, y) = ref (Thunk (fn () => f (force x, force y)))              
  fun concat x = ref (Thunk (fn () => force (force x)))
  fun fix f = ref (Thunk (fn () => force (f (fix f))))
end

structure L = Lazy'

datatype 'a Tree = Tip of 'a | Fork of 'a Tree * 'a Tree

fun fst (x, y) = x
fun snd (x, y) = y

val fm : int Tree -> int L.t Tree =
 fn t =>
    let
       val rec repmin : int Tree -> int L.t -> (int L.t Tree * int) =
        fn Tip n => (fn m => (print (Int.toString n ^ "\n"); (Tip m, n)))
         | Fork(l, r) => (fn m =>
           let
              val (t1, m1) = repmin l m
              val (t2, m2) = repmin r m
           in
              (Fork(t1, t2), Int.min (m1, m2))
           end)
       val rec p : unit -> int L.t Tree * int =
        fn () => repmin t (L.fmap snd (L.make p))
    in
       fst (p ())
    end

The solution, again due to Jake, is to make a more clever fixpoint operator that makes sure it only is forced once:

signature LAZY = 
sig
  ...
  val fix' : ('a t -> 'a) -> 'a t
end 


structure Lazy' :> LAZY =
struct 
  ...
  fun fix' f =
      let 
        val t = ref (Thunk (fn () => raise (Fail "")))
      in
        t := Thunk (fn () => f t)
      ; t
      end
end


val fm : int Tree -> int L.t Tree =
 fn t =>
    let
       ...
       val p : (int L.t Tree * int) L.t = L.fix' (fn p => repmin t (L.fmap snd p))
    in
      fst (L.force p)
    end




The Economic Naturalist I: A freeway for tourists.

Robert Frank, an economics professor at Cornell, wrote an excellent popular economics book called The Economic Naturalist.  In it he laments the myriad concepts taught in Economics 101, where students are overwhelmed with definitions and facts, yet six months after the end of the course can not solve basic problems in economics.  He fights this trend by asking students to search for counterintuitive economic phenomena in daily life and asks them to describe the phenomena and posit an explanation using fundamental economics principles.  The results are not only entertaining but instructive to the lay person.
Since I am a lay person, I figured I'd try my hand at such an observation.

Instructions:

Your space limit is five hundred words.  Many excellent papers are significantly shorter than that.  Please do not lard your essay with complex terminololgy.  Imagine yourself talking to a relative who has never had a course in economics.  The best papers are the ones that would be clearly intelligible to such a person, and typically these papers do not use any algebra or graphs.

Disclaimer: All comments in the following analysis have a high probability of being at least one of a) uninformed b) unintentionally misleading c) conflate unrelated ideas d) naive e) just false.

A Freeway in Mexico

I noticed a strange phenomenon when I was last in Mexico.  Like many tourists, I visited the stunning Mayan pyramid at Chichen Itza.  I arrived from Tulum, on a small, crowded two lane road.  The trip took approximately 3 hours.  On the way home, I noticed that there was a highway directly back to Cancun, my next destination.  The highway was impressive.  It consisted of between 4 and 6 lanes, and was in better condition than most roads with which I am familiar in the US.  More surprising was that I was nearly the only car on the road.  There were occasional tour busses and other tourists, but Mexican citizens were conspicuously absent.  After the hour and a half it took to reach Cancun, I realized why.  It turned out to be a toll road, and the toll was around $20US.  This angered me at first.  Why should such a resource be used so wastefully, only transporting foreigners able to afford the large toll?  The road could have supported many more drivers with no inconvenience to me.   Clearly an example of an inefficient market.

It was surprising at first, but isn't so much any more.  The function between the cost of the toll and the profit to the private company that ran the road may be maximized at such a high price; lowering the price enough to allow citizens to use the road may generate more fees, but also more road damage.  Moreover, the citizens may be so poor that the company would need to lower the fees so much that even the profits were smaller, regardless of the road damage.  The two-tiered society, say wealthy foreigners and citizens, makes Mexico worse off.  If the discrepancy were smaller, the company could charge less, make as much or more profit, and the economy of Mexico would be considerably strengthened.  Nationalization of the road doesn't seem to solve the problem.  Even if the road became state owned and the charges were lowered, the government would then need to repair the road out of the national budget.  A number of things could then happen.  Either the economy would generate enough new taxes to compensate for the necessary road repairs and Mexico would be better off.  Or it wouldn't and Mexico would be worse off.  The government would have the responsibility for maintaining the road, which would either fall into disrepair or the money would have to be channeled from somewhere else in the budget.  In the short run the people would be better off, but if the road fell into disrepair that improvement would slowly decrease.

One motivation for this sort of road policy may be that it is cheaper to have someone build you a road than to finance the construction yourself.  Perhaps Mexico gave the company an incentive to build the road by giving them toll rights for, say 20 years.  No matter how inequitably they manage the road, at year 20 Mexico has a first-rate road for a modest cost.  This happens in the US.  The city of Pittsburgh, where I live, announced their intention to lease the central turnpike to a company from Spain for a large upfront fee.  This fee would cover some of the state budget deficit, giving the future toll yields to the company for some period of time.

What struck me is that where in a relatively equal society everyone would benefit from a new road, in a highly unequal society, only the wealthy benefit in the short run.

Reversing perspective

3D -> 2D Art

Tuesday, April 27, 2010

GHCi 6.12 unicode emacs problem

I recently installed the Haskell Platform with GHC version 6.12.1 on my Mac.  The primary problem for me has been getting unicode to work correctly.  I was using the utf8-string package, but now it is no longer necessary.  For instance,

main = putStrLn "π x. p(x)"

works correctly when compiled.  However, I had trouble getting it to work via the emacs interactive buffer.  I found that when creating a ghci buffer with `inferior-haskell-load-file'  my bash shell variables, in particular LC_CTYPE, were not getting set.  I thus created a new program that first sets LC_CTYPE and then calls ghci.  This gets the unicode to print correctly in the inferior buffer.

#!/bin/bash

export LC_CTYPE="en_US.UTF-8"
ghci

I then customize `haskell-program-name' to be the name of this file.  If someone knows a better way to do this, please let me know.

Worth Reading, 4/27

Captives: What really happened during the Israli attacks?,  From The New Yorker 11/9/09

Monday, April 26, 2010

Everybody Draw Mohammed Day

After the odious South Park censorship by Comedy Central, (Ross Douthat wrote a probably accurate, if dramatic, analysis of the mess) an artist in Seattle drew a poster announcing the fictional group Citizens Against Citizens Against Humor that are holding the (formerly) fictional Everybody Draw Mohammed Day.

It caught on with a Facebook page, which quickly exploded with both righteous outrage against threats of violence over free speech, and blatant racism.  While the mean content of the myriad comments is rather low, the best are spot on.


This is not about hate. This is about intimidation. This is not only about Matt and Trey. This is about Salman Rushdie. This is about Kurt Westergaard. This is about Ayaan Hirsi Ali. This is about Sooreh Hera. This is about Theo Van Gogh.
I'm currently a hesitant supporter or the campaign.  Images of the prophet are offensive to Muslims.  I have no desire to offend my Muslim friends and colleagues.  There are other ways to address the actions and rhetoric of the small minority of fundamentalist bullies.  However, in this case I believe that personal action is required.   It is not fair that the writers of South Park be the targets of violence while I am entertained yet share none of their hazard.  Of course, they are rewarded handsomely for their effort, but I am the indirect beneficiary of their exercise of my rights of freedom of speech and expression.  


This collective action is a way to diffuse targets for violence.  Since there are currently only a handful of entertainers willing to take the risk of offending the radicals, they make easy targets for terrorists.  It is unfair that such a burden fall only upon those brave enough to risk violence for free expression.  If we as citizens care about the same ideals, and enjoy the fruits of an uncensored, free society, we should accept some part of the risk.  While Comedy Central has not acted admirably, we can not expect such corporations to take stands against their own financial interest.  If we care about and benefit from freedom of expression and freedom from fear, we should stand up.