Is MatchError becoming Scala's NullPointerException?

Aaron Novstrup
I realize the subject line is somewhat provocative, but I think this is
a serious issue that could be addressed with better language/compiler
support. I've noticed that the uses for partial functions and pattern
matching have expanded beyond match blocks, creating a potential for
unexpected MatchErrors or silent failures.

Consider the following code:

    val l = List(1 -> 2, 3 -> 4)
    val keys = for ((k,v) <- l) yield k

Suppose some absent-minded developer comes along, intending to add a
pair (5 -> 6):

    val l = List(1 -> 2, 3 -> 4, 5)         // oops, forgot the " -> 6"
    val keys = for ((k, v) <- l) yield k    // 5 is silently ignored

I could have used map:

    val keys = l map { case (k, v) => k }   // throws MatchError!

Surprisingly, the same problem arises if I attempt to use destructuring:

    val keys = l map { tuple => val (k, v) = tuple; k }  // throws?!?
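
To make the three behaviours concrete, here is a minimal sketch, assuming Scala 2 semantics (it should compile as-is on 2.13). The object name `RefutablePatternDemo` and the val names are invented for illustration, and the `withFilter` call is written out by hand to show roughly what the for comprehension expands to; it is not the compiler's literal output.

```scala
import scala.util.Try

object RefutablePatternDemo {
  // The list from the example; the stray 5 widens the element type to Any.
  val l: List[Any] = List(1 -> 2, 3 -> 4, 5)

  // Roughly what `for ((k, v) <- l) yield k` expands to in Scala 2:
  // a withFilter that discards non-matching elements, then a map.
  val viaFor: List[Any] =
    l.withFilter { case (_, _) => true; case _ => false }
      .map { case (k, _) => k }

  // A pattern-matching anonymous function has no filter in front of it,
  // so the 5 reaches the match and a MatchError is thrown.
  val viaMap: Try[List[Any]] = Try(l.map { case (k, _) => k })

  // `val (k, _) = tuple` is itself a pattern match, so it fails the same way.
  val viaVal: Try[List[Any]] = Try(l.map { tuple => val (k, _) = tuple; k })
}
```

Wrapping the last two in `Try` is only there so the failures can be inspected instead of aborting the program.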

I think the fundamental issue here is that destructuring is
unnecessarily tied to partial functions and is therefore needlessly
unsafe.  Would it be possible to define a syntax to distinguish safe and
unsafe destructuring?  For example,

Safe (generate compile error if types don't match):
    for ((k, v) <- l) yield k
    val (k, v) = tuple
    l map { ((k, v)) => k }

Unsafe:
    for (case (k, v) <- l) yield k    // filter out mismatches
    val case (k, v) = anything        // throw on mismatch
    l map { case (k, v) => k }        // throw on mismatch

The "safe" versions would expect an unapply method that returns Some (or
just a tuple -- is there any need to wrap in Some?).

The "unsafe" versions would additionally permit an unapply method that
returns Option. The presence of the case keyword would make it clear
that a MatchError is possible in each situation.
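
As a sketch of the distinction between the two kinds of unapply, assuming ordinary Scala 2 (this does not use the proposed syntax, and the extractors `Swapped` and `Even` are hypothetical names made up for this example):

```scala
import scala.util.Try

object ExtractorDemo {
  // Hypothetical extractor that can never fail: unapply is typed as Some.
  object Swapped {
    def unapply(p: (Int, Int)): Some[(Int, Int)] = Some((p._2, p._1))
  }

  // Hypothetical extractor that may fail: unapply is typed as Option.
  object Even {
    def unapply(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
  }

  // Always succeeds, so it would count as "safe" destructuring.
  val Swapped(a, b) = (1, 2)

  // Can fail at run time; under the proposal this would need `val case`.
  val half: Try[Int] = Try { val Even(h) = 3; h }
}
```

Today both definitions are written the same way, so nothing at the use site tells the reader that the second one can throw.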

I apologize in advance if this issue has already been recognized and
addressed. I'm hoping to get two things out of this email:
- comments from the community about the proposed syntax
- comments from the compiler folks about the feasibility of the proposal

Credit for pointing the problem out goes to Daniel Sobral in this SO
answer:
http://stackoverflow.com/questions/4380831/why-does-filter-have-to-be-defined-for-pattern-matching-in-a-for-loop-in-scala/4380925#4380925

~Aaron

Re: Is MatchError becoming Scala's NullPointerException?

Erik Engbrecht
I'd say NPE is Scala's NPE...slightly tamed from Java but still alive and kicking.

MatchError is a new beast.

--
http://erikengbrecht.blogspot.com/

Re: Is MatchError becoming Scala's NullPointerException?

Paul Phillips-3
In reply to this post by Aaron Novstrup

On Tue, Dec 07, 2010 at 03:24:30PM -0800, Aaron Novstrup wrote:
> I apologize in advance if this issue has already been recognized and
> addressed.

OK, one out of two.

There are a few spots I've been able to squeeze out some compile time
errors, such as this formerly horrifying example:

// 2.7.7
scala> val (x: Int, y: Long) = (5, 5)
scala.MatchError: (5,5)
        at .<init>(<console>:4)

// 2.8.0
scala> val (x: Int, y: Long) = (5, 5)
<console>:5: error: scrutinee is incompatible with pattern type;
 found   : Long
 required: Int
       val (x: Int, y: Long) = (5, 5)
                       ^

> Safe (generate compile error if types don't match):
>    for ((k, v) <- l) yield k
> Unsafe:
>    for (case (k, v) <- l) yield k    // filter out mismatches

That's exactly the syntax which was proposed the last couple times this
was bandied about.

See these tickets:

  https://lampsvn.epfl.ch/trac/scala/ticket/140
  https://lampsvn.epfl.ch/trac/scala/ticket/900

See this thread:

  http://www.scala-lang.org/node/995

There is a lot of past discussion but as usual it's not that easy to see
the whole picture at once.

--
Paul Phillips      | It is hard to believe that a man is
Apatheist          | telling the truth when you know that you
Empiricist         | would lie if you were in his place.
i pull his palp!   |     -- H. L. Mencken

Re: Is MatchError becoming Scala's NullPointerException?

Aaron Novstrup
On 12/07/2010 04:00 PM, Paul Phillips wrote:
> On Tue, Dec 07, 2010 at 03:24:30PM -0800, Aaron Novstrup wrote:
>> I apologize in advance if this issue has already been recognized and
>> addressed.
>
> OK, one out of two.

Maybe I should have said "or"? :)

>> Safe (generate compile error if types don't match):
>>     for ((k, v) <- l) yield k
>> Unsafe:
>>     for (case (k, v) <- l) yield k    // filter out mismatches
>
> That's exactly the syntax which was proposed the last couple times this
> was bandied about.
>
>[snip]
>
> See this thread:
>
>    http://www.scala-lang.org/node/995

My sense after reading that thread is that using the "case" keyword had
the most traction at the time, although not with Martin who saw it as a
special case.  I wonder whether he still feels that way about it.  In
what sense is consistent, explicit notation for unsafe pattern matching
a "special case"?

In any case, that discussion took place in early 2009, so I think it
would be useful to hear what the community has to say about the issue in
late 2010.

Paul, do you have a sense of how involved it would be to make these
changes in the compiler/library? From a user's perspective, the effects
of the language change would be fairly minimal since existing code would
either a) still work, b) fail with a compile error that could be fixed
by adding 'case', or c) fail with a compile error indicating that the
existing code was broken.


Re: Is MatchError becoming Scala's NullPointerException?

Russ P.
In reply to this post by Paul Phillips-3
Here's what I find more disturbing:

scala> val (x, y) = (4, 5)
x: Int = 4
y: Int = 5

scala> val (X, Y) = (7, 8)
<console>:5: error: not found: value X
       val (X, Y) = (7, 8)
            ^
<console>:5: error: not found: value Y
       val (X, Y) = (7, 8)
               ^

Apparently you can't use variable names that start with caps with this syntax. Yeah, I realize that this has been discussed before, and it's not some huge deal, but it does seem rather strange to me.

Russ P.


--
http://RussP.us

Re: Is MatchError becoming Scala's NullPointerException?

Paul Phillips-3
In reply to this post by Aaron Novstrup
On Tue, Dec 07, 2010 at 03:24:30PM -0800, Aaron Novstrup wrote:
> Paul, do you have a sense of how involved it would be to make these
> changes in the compiler/library? From a user's perspective, the
> effects of the language change would be fairly minimal since existing
> code would either a) still work, b) fail with a compile error that
> could be fixed by adding 'case', or c) fail with a compile error
> indicating that the existing code was broken.

Despite the fact that I never did get the parameterized version of #900
working, I'll go out on a limb and say it wouldn't be difficult.  (The
rest is done, assuming I can find it after all this time.) You can
safely focus on the real challenge of selling Martin.

--
Paul Phillips      | Simplicity and elegance are unpopular because
Apatheist          | they require hard work and discipline to achieve
Empiricist         | and education to be appreciated.
i pull his palp!   |     -- Dijkstra

Re: Is MatchError becoming Scala's NullPointerException?

Erik Engbrecht
In reply to this post by Russ P.
Variables with names that begin with capital letters are presumed to be stable identifiers for pattern matching.

Consider:
val x: Option[Int] = Some(1)  // could equally be None

x match {
  case Some(x) => // the inner x is bound to the Int value inside Some
  case None    => // the outer x is equality-checked against None
}

You don't want None bound to the value of x, you want x checked against None.  The capital letters allow the compiler to disambiguate between "bind to the variable name" and "check against this pattern."
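
The same rule explains the `val (X, Y)` error above. A minimal sketch, assuming Scala 2 semantics (the names `StableIdDemo`, `Zero`, and `mismatch` are made up for illustration): once a capitalized value is in scope, the pattern compiles, but it compares instead of binding.

```scala
import scala.util.Try

object StableIdDemo {
  val Zero = 0

  // Capitalized Zero is not bound here; the first component is compared
  // against the existing value with ==, and only y is introduced.
  val (Zero, y) = (0, 5)

  // Same pattern, but the first component differs from Zero, so the
  // match fails at run time with a MatchError.
  val mismatch: Try[Int] = Try { val (Zero, z) = (1, 5); z }
}
```

So a capitalized name in a val pattern is either a compile error (nothing in scope) or a run-time equality test, never a fresh variable.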

Of course if you're used to Python it would never occur to you that the code you show involves pattern matching.  I know I've struggled with this and spent a fair amount of time studying the spec and probably still don't completely understand it.  But I figure the pattern matcher still has a lot of dark corners in it, and if I did manage to really deeply understand pattern matching, the first thing I'd do is frustrate myself with finding all those dark corners.  And then maybe Paul would declare that he has to go spend a year meditating on the top of a mountain in order to figure out how to redesign the pattern matcher.


On Tue, Dec 7, 2010 at 8:29 PM, Russ Paielli <[hidden email]> wrote:
> [snip]

--
http://erikengbrecht.blogspot.com/

Re: Is MatchError becoming Scala's NullPointerException?

Paul Phillips-3
On Tue, Dec 07, 2010 at 09:24:55PM -0500, Erik Engbrecht wrote:
> And then maybe Paul would declare that he has to go spend a year
> meditating on the top of a mountain in order to figure out how to
> redesign the pattern matcher.

Given that my plan has been to rewrite it in Switzerland, you may be
onto something there.

--
Paul Phillips      | All men are frauds.  The only difference between
Analgesic          | them is that some admit it.  I myself deny it.
Empiricist         |     -- H. L. Mencken
all hip pupils!    |----------* http://www.improving.org/paulp/ *----------

Re: Is MatchError becoming Scala's NullPointerException?

Russ P.
In reply to this post by Erik Engbrecht
On Tue, Dec 7, 2010 at 6:24 PM, Erik Engbrecht <[hidden email]> wrote:
> Variables with names that begin with capital letters are presumed to be stable identifiers for pattern matching.


So can capitalized variable names bite me at run time, or will any potential problems with them be caught at compile time? I occasionally use capitalized variable names to be consistent with acronyms, such as "IFR: Boolean" (a flag to indicate whether an aircraft is flying under Instrument Flight Rules) or "RVSM: Boolean" (a flag to indicate whether an aircraft is equipped for Reduced Vertical Separation Minimum). Acronyms shown in lower case bother me for some reason.

Russ P.

--
http://RussP.us

Re: Is MatchError becoming Scala's NullPointerException?

Erik Engbrecht
It depends on what you are expecting.  If a name used in a pattern begins with a capital letter, the compiler will expect a value with that name to be in scope.

So if you have:
scala> val IFR = true
IFR: Boolean = true

scala> false match {
     |   case IFR => println("IFR")
     |   case false => println("false")
     | }
false

scala> true match {                    
     |   case IFR => println("IFR")    
     |   case false => println("false")
     | }
IFR

scala> true match {                  
     |   case RVSM => println("rvsm")
     |   case _ => println("_")
     | }
<console>:7: error: not found: value RVSM
         case RVSM => println("rvsm")
              ^


So using RVSM led to a compile-time error because there was no RVSM value in scope.  But IFR matched against the value I had assigned to IFR, because IFR was in scope.


--
http://erikengbrecht.blogspot.com/

Re: Is MatchError becoming Scala's NullPointerException?

Jim Balter
In reply to this post by Russ P.
On Tue, 07 Dec 2010 18:45:30 -0800, Russ Paielli wrote:

> On Tue, Dec 7, 2010 at 6:24 PM, Erik Engbrecht
> <[hidden email]> wrote:
>
>> Variables with names that begin with capital letters are presumed to be
>> stable identifiers for pattern matching.
>>
>>
> So can capitalized variable names bite me at run time, or will any
> potential problems with them be caught at compile time?

Sure it can bite you at run time ... if you use a defined uncapitalized
name expecting a match to its defined value. e.g.,

scala> val X = "X"                                                        
X: java.lang.String = X

scala> ("Y",0) match { case (X,0) => "case 1 "+X; case _ => "case 2 "+X }
res0: java.lang.String = case 2 X

"X" doesn't match "Y", as expected, but

scala> val x = "x"
x: java.lang.String = x

scala> ("Y",0) match { case (x,0) => "case 1 "+x; case _ => "case 2 "+x }
res1: java.lang.String = case 1 Y

x (inside the block; the other x is a different variable) receives the
value "Y". If you want to use the value of x from outside the block, you
have to use `x`:

scala> val x = "x"
x: java.lang.String = x

scala> ("Y",0) match { case (`x`,0) => "case 1 "+x; case _ => "case 2 "+x }
res2: java.lang.String = case 2 x


This seems like poor language design, until you try to come up with a
decent alternative.

> I occasionally
> use capitalized variable names to be consistent with acronyms, such as
> "IFR: Boolean" (a flag to indicate whether an aircraft is flying under
> Instrument Flight Rules) or "RVSM: Boolean" (a flag to indicate whether
> an aircraft is equipped for Reduced Vertical Separation Minimum).
> Acronyms shown in lower case bother me for some reason.
>
> Russ P.

That's no problem; the ambiguity only occurs with *un*capitalized names.