Thank you that @bgamari. It's interesting. Since hashing would force the string and violate Haskell semantics, then let's just do binary search on Strings. No issues with semantics there, and we get the O(log(n)) behavior I was after.

@bgamari I am not sure what you mean. There are two suggestions. When compiling a function like `f`

above (or a case expression) either generate code to do binary search on the constant Strings and dispatch that way, OR at compile time hash the constant Strings to integers and then generate code that does binary search of the integers (after hashing the input string) -- finally doing a string eq-comparison once a leaf is reached. How does https://hackage.haskell.org/package/hashable-1.4.0.2/docs/Data-Hashable.html#t:Hashed help me achieve either of those two?

There are workarounds of course. The one we use now is not to do case on Strings, and just build a `Map String Function`

. But the current ghc compilating strategy is supprising and (as far as I know) undocumented -- it's a performance trap.

You mean traversing the list of the input string to compute the hash function would violate the semantics of the language? That may technically be true, but it's really not in the spirit of the case expression which is in essense how the language computes anything (yes I know up to WHNF). And it's a terrible way to live if you write programs like the following:

```
f "abc" = 1
f "def" = 2
f "ghi" = 3
f _ = 4
```

and you are relying on the fact that the compiler when evaluating something `f str`

, will only look at `str`

up to length 3. Anyway, I am not married to this idea -- I was just pointing an alternative compilation strategy. Binary search on String (instead of the hash values) would be fine as well.

@simonpj I don't have a lot of time so it will go slow, but I would like to try to fix this. Please point me to the right place and give as much info as you can on where/how this should be implemented.

Just found out today that ghc has the same issue with literal strings -- which is unfortunate because it is quite common to go from a String to an ADT in haskell code (at least in our code-base it is).

```
data D = A1 | A2 | ... | An
stringToAdt :: String -> D
stringToAdt "a1" = A1
stringToAdt "a2" = A1
...
stringToAdt "an" = An
stringToAdt _ = error "no such adt"
```

and for this we will again get a linear search. Just like for Integers (or anything that has Ord really), we should be doing binary search.

Javac takes another route when it comes to the compiling switch for the String case. They hash all the constant strings to Ints and transform the switch into a switch on Ints -- which is then compiled into binary search on Ints (which presumably will be faster than doing binary search with string comparisons). In haskell this would look like:

```
stringToAdt :: String -> D
stringToAdt str = go (hash str)
where
go :: Int -> D
go INT_HASH_VAL_FOR_a1 | str == "a1" = A1
go INT_HASH_VAL_FOR_a2 | str == "a2" = A2
...
go INT_HASH_VAL_FOR_an | str == "an" = An
go _ = error "no such adt"
```

**Neo**
(41dbf65f)
*
at
08 Oct 21:06
*

Fix indentation.

**Neo**
(d886eced)
*
at
08 Oct 21:04
*

Lots more work on contiguous regions. Lots more tests.

@hsyl20 @bgamari Please take care when transferring the negation from the variable to the constant. Test the case when the constant happens to be INT_MIN. It's negation is itself -- which may be ok but please test it.

gcc does this for the following c program.

```
int __attribute__ ((noinline)) g(int x) {
return INT_MIN * (-x);
}
```

```
g:
.LFB28:
.cfi_startproc
endbr64
movl %edi, %eax
sall $31, %eax
ret
.cfi_endproc
```

I guess because for any other value than 0 or 1 the program is undefined, this expression works for 0 and 1. For 0 we get all bits 0, and for 1 we get 1000000...0000000000 which is INT_MIN where we started from. I am not saying we have to do the same but please test this case.

**Neo**
(d51a61b4)
*
at
23 Sep 09:14
*

Delete unused code, and general cleanup.

**Neo**
(4029f00e)
*
at
22 Sep 09:25
*

New case expression tests.

**Neo**
(401f7c16)
*
at
22 Sep 09:23
*

Better plan for three isolated cases with default.

*
... and
1 more commit
*

**Neo**
(751d5ee2)
*
at
20 Sep 10:37
*

Improve generated code for FourLabels in the presense of default. A...

**Neo**
(d6447ae7)
*
at
18 Sep 07:50
*

Continued development.

**Neo**
(e5a13e4a)
*
at
15 Sep 10:34
*

In functions like the once below the two negation operations can be safely eliminated. The optimization is valid for all integer and floating point types. The same story is of course true for division.

```
f_int_1 :: Int -> Int -> Int
f_int_1 x y = (- x) * (-y)
f_double_1 :: Double -> Double -> Double
f_double_1 x y = (- x) * (- y)
```

Right now we produce:

```
f_int_1
= \ (x_a2tn :: Int) (y_a2to :: Int) ->
case x_a2tn of { GHC.Types.I# x1_i2ve ->
case y_a2to of { GHC.Types.I# x2_X2vE ->
GHC.Types.I#
(GHC.Prim.*#
(GHC.Prim.negateInt# x1_i2ve) (GHC.Prim.negateInt# x2_X2vE))
}
}
f_double_1
= \ (x_a2tq :: Double) (y_a2tr :: Double) ->
case x_a2tq of { GHC.Types.D# x1_a2vt ->
case y_a2tr of { GHC.Types.D# x2_X2vV ->
GHC.Types.D#
(GHC.Prim.*##
(GHC.Prim.negateDouble# x1_a2vt) (GHC.Prim.negateDouble# x2_X2vV))
}
}
```

Moreover for the following two functions we could be (at compile time) removing the negation from the variable and pushing it into the constant:

```
f_int_2 :: Int -> Int
f_int_2 x = 3 * (-x)
f_double_2 :: Double -> Double
f_double_2 x = 3.0 * (-x)
```

we should be transforming these into:

```
f_int_2 :: Int -> Int
f_int_2 x = -3 * x
f_double_2 :: Double -> Double
f_double_2 x = -3.0 * x
```