Discussion:
Improving 'syntax error, unexpected $end, expecting kEND'?
Hugh Sasse
2007-10-18 18:01:55 UTC
I've had a look at this, but can't see how to do it: When I get
syntax error, unexpected $end, expecting kEND
I know from experience that $end means "end of file" and kEND means
the lexical token "end". What I can't figure out from reading
parse.c is how the stack works and how the rules are stored. I think
knowing if it was expecting the 'end' from a 'class', 'def', 'begin',
'while', 'for', 'do', 'if', 'unless' or 'case' would help me narrow
down where I've failed to have a closing "end". If there's line
number information that would be even better. Indentation has not
solved this for me.
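
For illustration only, here is a minimal Ruby sketch of the bookkeeping being asked for: keep a stack of block openers with their line numbers and report whatever is still open at end of file. This is not how parse.y works. The names unclosed_blocks and OPENERS are made up for this example, and the scanner only looks at the first word of each line plus a trailing "do", so modifiers, strings, heredocs and one-line definitions will fool it.

```ruby
# Naive sketch, NOT a real Ruby parser: it inspects only the first word of
# each line and a trailing `do`. OPENERS and unclosed_blocks are hypothetical
# names used for this illustration.
OPENERS = %w[class module def if unless while until for case begin].freeze

# Returns [keyword, line_number] pairs for every opener that never met an `end`.
def unclosed_blocks(source)
  stack = []
  source.each_line.with_index(1) do |line, lineno|
    code = line.sub(/#.*/, "").strip      # crude comment stripping
    first = code[/\A\w+/]                 # first word on the line, if any
    if OPENERS.include?(first)
      stack.push([first, lineno])
    elsif code.match?(/\bdo(\s*\|[^|]*\|)?\z/)
      stack.push(["do", lineno])          # `foo.each do |x|` style blocks
    elsif first == "end"
      stack.pop
    end
  end
  stack
end
```

Whatever remains on the stack is the list of openers still waiting for an 'end', outermost first, each with the line where it started.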

So is this too difficult, which is why it hasn't happened already?

Meanwhile I'll just break up my program into smaller bits.

Thank you,
Hugh
Paul Brannan
2007-10-22 21:20:36 UTC
Post by Hugh Sasse
I've had a look at this, but can't see how to do it: When I get
syntax error, unexpected $end, expecting kEND
I know from experience that $end means "end of file" and kEND means
the lexical token "end". What I can't figure out from reading
IMO this is a pretty cryptic message. It gets easier to diagnose the
more you see the message, but I'd rather see a message like:

Syntax error: unexpected end of file while looking for matching 'end'
Post by Hugh Sasse
parse.c is how the stack works and how the rules are stored. I think
knowing if it was expecting the 'end' from a 'class', 'def', 'begin'
'while', 'for', 'do', 'if', 'unless' or 'case' would help me narrow
down where I've failed to have a closing "end". If there's line
number information that would be even better. Indentation has not
solved this for me.
Indentation may not solve the problem, but it does help.

When all else fails I comment out sections of code until the code
compiles; when that happens, I know the offending code is probably in
the code block I most recently commented out.
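
The comment-out approach can be automated in a rough way. The sketch below assumes Ruby's bundled Ripper library is available and relies on Ripper.sexp returning nil when the source does not parse. The helper names parses? and suspect_sections, and the idea of passing in candidate line ranges by hand, are assumptions made for this example.

```ruby
require "ripper"

# True when the source parses; Ripper.sexp returns nil on a syntax error.
def parses?(source)
  !Ripper.sexp(source).nil?
end

# Blank out each candidate line range in turn (1-based, inclusive) and return
# the ranges whose removal lets the rest of the file parse.
def suspect_sections(source, ranges)
  lines = source.lines
  ranges.select do |range|
    trimmed = lines.each_with_index.map do |line, i|
      range.cover?(i + 1) ? "\n" : line   # keep line numbers stable
    end
    parses?(trimmed.join)
  end
end
```

A range that is reported here is only "probably" the culprit, in the same sense as in the manual procedure: removing it happens to rebalance the file.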

I don't think that specifying which element was missing the end
necessarily helps track down the problem. Consider:

class Foo
def foo
while true
end
return 42
end
1+2

With your proposed change I might expect to see something like:

Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'

but blindly putting an 'end' after bar() might be wrong; the user might
have meant either:

class Foo
  def foo
    while true
    end
    return 42
  end
  bar()
end

or:

class Foo
  def foo
    while true
    end
    return 42
  end
end
bar()

but it's impossible to know without probing the user's brain (though
analyzing indentation could possibly help).

Paul
Michal Suchanek
2007-10-23 11:05:49 UTC
Post by Paul Brannan
Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'
but blindly putting an 'end' after bar() might be wrong; the user might
class Foo
def foo
while true
end
return 42
end
bar()
end
class Foo
def foo
while true
end
return 42
end
end
bar()
but it's impossible to know without probing the user's brain (though
analyzing indentation could possibly help).
Yes, but the fact that you have unclosed class Foo hints quite a bit
where the error might be. If you have some testing code below the
class you can be quite sure that the missing end is inside the class
definition. If you have multiple classes you would get a nested class
warning (or not), which should help you tell which one is broken. Even
then, saying which class is not closed is better than not giving a
warning about nested classes - not everybody knows that.
Of course, the error might be in quite surprising places. But stating
what the top unclosed element is certainly makes it easier to
diagnose.

Thanks

Michal
Rick DeNatale
2007-10-23 12:41:40 UTC
Post by Paul Brannan
I don't think that specifying which element was missing the end
class Foo
def foo
while true
end
return 42
end
1+2
Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'
but blindly putting an 'end' after bar() might be wrong; the user might
class Foo
def foo
while true
end
return 42
end
bar()
end
class Foo
def foo
while true
end
return 42
end
end
bar()
but it's impossible to know without probing the user's brain (though
analyzing indentation could possibly help).
Way back in my college days (I'm amazed that my senility hasn't
advanced to the point where I can't remember that far back<g>) I took
a programming course which used a slightly simplified academic version
of PL/1 from Cornell called PL/C.

This was back in the day when you submitted batch jobs, and got a
printout to study and debug.

The PL/C compiler tried as hard as it could to correct syntax errors
to get the "most" out of each run.

It would put out messages like:

20: A line with a syntax error
SYNTAX ERROR ON LINE 20 ....
PL/C USES:
pl/c's guess of what you meant.

More often than not, IIRC, this produced more amusing cascading errors
than a real solution.
--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
Hugh Sasse
2007-10-23 15:31:05 UTC
Post by Paul Brannan
I don't think that specifying which element was missing the end
[class Foo;def foo;while true;end;return 42;end;1+2]
Post by Paul Brannan
Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'
but blindly putting an 'end' after bar() might be wrong; the user might
Agreed. Doing anything without consideration will cause problems. But
at least it tells the user where the interpreter began to get in knots.
[examples trimmed]
Post by Paul Brannan
but it's impossible to know without probing the user's brain (though
analyzing indentation could possibly help).
Way back in my college days ([...]) I took
a programming course which used a slightly simplified academic version
of PL/1 from Cornell called PL/C.
[...]
The PL/C compiler tried as hard as it could to correct syntax errors
to get the "most" out of each run.
[...]
More often than not, IIRC, this produced more amusing cascading errors
than a real solution.
Agreed it is a less than perfect solution, but in terms of figuring
out what the interpreter is doing, it is more diagnostic info than
we get now. We know that the open and closing statements are
balanced between the statement after that given and end of file.

Hugh
David Flanagan
2007-10-23 19:15:05 UTC
The patch below changes this message to:

syntax error, unexpected "end-of-file", expecting "end"

It doesn't tell you where the outermost open block begins, but at least
the message isn't so cryptic. I don't know any way to get rid of the
quotes around "end-of-file".
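
As a rough Ruby-level analogue of what this renaming achieves, the sketch below rewrites the raw token names in an already-formatted message. It is only a post-processing illustration, not what the patch does inside the grammar; the TOKEN_NAMES table and the friendlier_message helper are hypothetical and cover only the two tokens discussed in this thread.

```ruby
# Hypothetical lookup table: raw parser token name => readable description.
TOKEN_NAMES = {
  "$end" => "end of file",
  "kEND" => "'end'"
}.freeze

# Replace known raw token names in a message; unknown tokens pass through.
def friendlier_message(message)
  message.gsub(/\$end|\bk[A-Z_]+\b/) { |tok| TOKEN_NAMES.fetch(tok, tok) }
end

puts friendlier_message("syntax error, unexpected $end, expecting kEND")
# prints: syntax error, unexpected end of file, expecting 'end'
```

Doing the renaming in the grammar, as the patch does, keeps the names in one place and avoids string surgery on every message.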

David

Index: parse.y
===================================================================
--- parse.y (revision 13760)
+++ parse.y (working copy)
@@ -596,6 +596,7 @@
/*%
%token <val>
%*/
+ end_of_file 0 "end-of-file"
keyword_class
keyword_module
keyword_def
@@ -603,7 +604,7 @@
keyword_begin
keyword_rescue
keyword_ensure
- keyword_end
+ keyword_end "end"
keyword_if
keyword_unless
keyword_then
Martin Duerst
2007-10-24 06:58:14 UTC
Post by David Flanagan
syntax error, unexpected "end-of-file", expecting "end"
It doesn't tell you where the outermost open block begins, but at least the message isn't so cryptic. I don't know any way to get rid of the quotes around "end-of-file".
This looks extremely helpful, in particular for (relative) beginners.
Looking at the patch below, it seems to be easy to improve things for
a few other cases, too. I just tried to fill in the blanks; maybe
some of this stuff doesn't make sense, but I have included a patch
below.

As for personal experiences, when I get an error of this kind,
I often add an "end" at some arbitrary place and see what happens.
Then I move that "end" around a bit and see again what happens.
That usually leads to a solution pretty quickly. Seems that sometimes
randomized algorithms are useful even for programming :-).
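
The "move an end around" trick can be sketched as an exhaustive search rather than a random one. This assumes the bundled Ripper library, whose Ripper.sexp returns nil for source that does not parse; the helper name end_candidates is made up for this example. It only tells you where a single extra "end" would make the file parse, not which of those places the author intended.

```ruby
require "ripper"

# Try one extra "end" after every line and return the 1-based line numbers
# after which the patched source parses.
def end_candidates(source)
  lines = source.lines.map(&:chomp)
  (1..lines.size).select do |n|
    patched = lines.take(n) + ["end"] + lines.drop(n)
    !Ripper.sexp(patched.join("\n") + "\n").nil?
  end
end
```

For the unclosed 'class Foo' example earlier in the thread, the candidates include both the position right after the method's closing 'end' and the very end of the file, which is exactly the ambiguity Paul pointed out.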

Regards, Martin.


Index: parse.y
===================================================================
--- parse.y (revision 13764)
+++ parse.y (working copy)
@@ -596,54 +596,55 @@
/*%
%token <val>
%*/
- keyword_class
- keyword_module
- keyword_def
- keyword_undef
- keyword_begin
- keyword_rescue
- keyword_ensure
- keyword_end
- keyword_if
- keyword_unless
- keyword_then
- keyword_elsif
- keyword_else
- keyword_case
- keyword_when
- keyword_while
- keyword_until
- keyword_for
- keyword_break
- keyword_next
- keyword_redo
- keyword_retry
- keyword_in
- keyword_do
- keyword_do_cond
- keyword_do_block
+ end_of_file 0 "end-of-file"
+ keyword_class "class"
+ keyword_module "module"
+ keyword_def "def"
+ keyword_undef "undef"
+ keyword_begin "begin"
+ keyword_rescue "rescue"
+ keyword_ensure "ensure"
+ keyword_end "end"
+ keyword_if "if"
+ keyword_unless "unless"
+ keyword_then "then"
+ keyword_elsif "elsif"
+ keyword_else "else"
+ keyword_case "case"
+ keyword_when "when"
+ keyword_while "while"
+ keyword_until "until"
+ keyword_for "for"
+ keyword_break "break"
+ keyword_next "next"
+ keyword_redo "redo"
+ keyword_retry "retry"
+ keyword_in "in"
+ keyword_do "do"
+ keyword_do_cond "do"
+ keyword_do_block "do"
keyword_do_LAMBDA
- keyword_return
- keyword_yield
- keyword_super
- keyword_self
- keyword_nil
- keyword_true
- keyword_false
- keyword_and
- keyword_or
- keyword_not
- modifier_if
- modifier_unless
- modifier_while
- modifier_until
- modifier_rescue
- keyword_alias
- keyword_defined
- keyword_BEGIN
- keyword_END
- keyword__LINE__
- keyword__FILE__
+ keyword_return "return"
+ keyword_yield "yield"
+ keyword_super "super"
+ keyword_self "self"
+ keyword_nil "nil"
+ keyword_true "true"
+ keyword_false "false"
+ keyword_and "and"
+ keyword_or "or"
+ keyword_not "not"
+ modifier_if "if"
+ modifier_unless "unless"
+ modifier_while "while"
+ modifier_until "until"
+ modifier_rescue "rescue"
+ keyword_alias "alias"
+ keyword_defined "defined"
+ keyword_BEGIN "BEGIN"
+ keyword_END "END"
+ keyword__LINE__ "__LINE__"
+ keyword__FILE__ "__FILE__"

%token <id> tIDENTIFIER tFID tGVAR tIVAR tCONSTANT tCVAR tLABEL
%token <node> tINTEGER tFLOAT tSTRING_CONTENT tCHAR



#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:***@it.aoyama.ac.jp
David Flanagan
2007-10-24 07:57:54 UTC
Thanks for filling these in, Martin. I worry that this is such a simple
thing that there must be some reason it wasn't done before.... Impact
on performance? Breaking the Ripper code?

It occurs to me that maybe the end-of-file line should just be

end_of_file 0

instead of

end_of_file 0 "end-of-file"

that ought to get rid of the misleading quotes that make "end-of-file"
appear like a keyword. I tried using "EOF" as the token name, but there
was already a macro with that name and that messed everything up.

David
Martin Duerst
2007-10-25 07:04:18 UTC
Post by David Flanagan
Thanks for filling these in Martin.
I'm sorry, but I should have compiled these earlier.
I actually get some warnings because some strings (e.g. "if", "do")
appear multiple times.
Post by David Flanagan
I worry that this is such a simple thing that there must be some reason it wasn't done before.... Impact on performance? Breaking the Ripper code?
One reason I suspect is yacc. As an example,
http://dinosaur.compilertools.net/yacc/index.html,
which gives explicit yacc syntax in appendix C, doesn't mention
this facility.

Regards, Martin.
Robin Stocker
2007-10-25 22:01:53 UTC
Post by Martin Duerst
Post by David Flanagan
I worry that this is such a simple thing that there must be some reason it wasn't done before.... Impact on performance? Breaking the Ripper code?
One reason I suspect is yacc. As an example,
http://dinosaur.compilertools.net/yacc/index.html,
which gives explicit yacc syntax in appendix C, doesn't mention
this facility.
Hi all,

I made an attempt to fix this issue one and a half years ago:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/7653

Here's a summary:

- Only Bison understands the syntax with the descriptive name in quotes.
- So I made a patch to the Makefile which strips the "foo" from parse.y
if we aren't using Bison and then feeds that to the parser generator.
- The patch used GNU make extensions, which weren't acceptable, so I gave
it another try with standard shell features:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/7695

- There was no response, I got discouraged and gave up.

Now that there is interest in it again, maybe it can be resolved once
and for all. Would it help if I updated my old patch?

By the way, I can't understand why the core developers apparently aren't
interested in fixing this wart.

Regards,
Robin Stocker
Nobuyoshi Nakada
2007-10-26 01:29:47 UTC
Hi,

At Fri, 26 Oct 2007 07:01:53 +0900,
Post by Robin Stocker
- The patch used GNU make extensions which wasn't acceptable, so I gave
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/7695
- There was no response, I got discouraged and gave up.
Sorry, I've missed it.
Post by Robin Stocker
Now that there is interest in it again, maybe it can be resolved once
and for all. Would it help if I updated my old patch?
In 1.9, bison is now required for ripper, so it won't be necessary to
strip them.
--
Nobu Nakada
David Flanagan
2007-10-26 06:09:30 UTC
Post by Nobuyoshi Nakada
Hi,
At Fri, 26 Oct 2007 07:01:53 +0900,
Post by Robin Stocker
- The patch used GNU make extensions which wasn't acceptable, so I gave
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/7695
- There was no response, I got discouraged and gave up.
Sorry, I've missed it.
Post by Robin Stocker
Now that there is interest in it again, maybe it can be resolved once
and for all. Would it help if I updated my old patch?
In 1.9, bison is now required for ripper, so it won't be necessary to
strip them.
Well, then. Attached is an updated version of Robin's patch. I made
some small changes to some of the token names Robin had picked. In
particular, I avoided ambiguity when two tokens represented different
uses of the same keyword by adding spaces. E.g. "if" is the if
statement and "if " is the if modifier. bison won't let me use the same
string for both tokens, but I suspect that simpler is better in error
messages. Also, added a new end_of_file token hardcoded as token 0
which is what bison seems to use. I didn't give this one a
double-quoted name because the double quotes get included in the error
messages (with my version of bison at least) and I felt that this:

unexpected end_of_file

is clearer than:

unexpected "end of file"

This seems to work okay. make test works, except for tests in
bootstraptest/test_syntax.rb which explicitly test the content of error
messages. If this patch is accepted, I'll patch the test cases to match.

David

Robin Stocker
2007-10-26 12:36:31 UTC
Post by David Flanagan
Post by Nobuyoshi Nakada
Hi,
At Fri, 26 Oct 2007 07:01:53 +0900,
Post by Robin Stocker
- The patch used GNU make extensions which wasn't acceptable, so I gave
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/7695
- There was no response, I got discouraged and gave up.
Sorry, I've missed it.
Post by Robin Stocker
Now that there is interest in it again, maybe it can be resolved once
and for all. Would it help if I updated my old patch?
In 1.9, bison is now required for ripper, so it won't be necessary to
strip them.
Thanks Nobu, that's good news.
Post by David Flanagan
Well, then. Attached is an updated version of Robin's patch. I made
some small changes to some of the token names Robin had picked. In
particular, I avoided ambiguity when two tokens represented different
uses of the same keyword by adding spaces. E.g. "if" is the if
statement and "if " is the if modifier. bison won't let me use the same
string for both tokens, but I suspect that simpler is better in error
messages.
How do the error messages look for these cases? Are the spaces preserved
and do they show up in the messages?
David Flanagan
2007-10-26 17:47:26 UTC
Post by Robin Stocker
Post by David Flanagan
Well, then. Attached is an updated version of Robin's patch. I made
some small changes to some of the token names Robin had picked. In
particular, I avoided ambiguity when two tokens represented different
uses of the same keyword by adding spaces. E.g. "if" is the if
statement and "if " is the if modifier. bison won't let me use the same
string for both tokens, but I suspect that simpler is better in error
messages.
How do the error messages look for these cases? Are the spaces preserved
and do they show up in the messages?
Yes, the spaces are preserved.

With this patch, if I do ruby -e "class rescue", I get

-e:1: syntax error, unexpected "rescue "

The space is weird, but I think (or at least I thought last night) that
it is better than this:

-e:1: syntax error, unexpected "rescue modifier"

To me, the quotes imply that the word "modifier" appeared in the program
text. (I really wish that bison allowed us to embed our own quotes in
the token names when we wanted to and didn't insert them for us--this is
a bigger problem for tokens that don't map to individual keywords or
operators.)

Maybe a way to avoid the spaces would be to choose some non-printing
ASCII character instead of space. (like ^G, to ring the terminal bell
on errors :-)

Or, maybe we just use the same string "rescue" for both the statement
and modifiers forms of the keyword and just live with the warning that
bison issues. That is a little scary, though since apparently the
strings in quotes can be used as alternatives to the token identifiers
in the grammar itself, so there really ought to be a one-to-one mapping.

Another possibility is to hack yyerror to clean up the error messages
for us, possibly stripping the quotation marks (so that we could have
"\"rescue\" modifier" with only the keyword in quotes) and removing any
hacky spaces or non-printing characters we stuck into the token strings
to fool bison. I haven't done any real C string manipulation complete
with memory allocation and freeing in 10 years, however, so I'm a little
reluctant to attempt this myself...

David

David Flanagan
2007-10-26 18:23:16 UTC
Post by David Flanagan
Another possibility is to hack yyerror to clean up the error messages
for us, possibly stripping the quotation marks (so that we could have
"\"rescue\" modifier" with only the keyword in quotes and removing any
hacky spaces or non-printing characters we stuck into the token strings
to fool bison. I haven't done any real C string manipulation complete
with memory allocation and freeing in 10 years, however, so I'm a little
reluctant to attempt this myself...
Okay, despite my protests of not knowing how to do this, I figured it
out. The yyerror function already uses ALLOCA_N once to allocate a
string on the stack, so I don't feel bad about using it again.

Now, before printing the error message it strips all double quotes out
of it. So we can get error messages like "unexpected 'rescue' modifier".
And I can get "end of file" in an error message without quotes.
Furthermore, we can solve the issue of having to have unique strings for
each token by just inserting additional escaped double-quotes characters
where I was using spaces before. They'll be stripped out.
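
The actual change lives in the C yyerror function, which is not shown in this thread. As a sketch of the idea only, the Ruby below strips every double quote from a message before it is displayed; the helper name strip_double_quotes and the sample strings are assumptions for illustration.

```ruby
# Remove every double-quote character from an error message before display.
def strip_double_quotes(message)
  message.delete('"')
end

puts strip_double_quotes(%q(syntax error, unexpected "'rescue' modifier"))
# prints: syntax error, unexpected 'rescue' modifier

# Two token strings that differ only by an embedded double quote stay
# distinct for the parser generator but display identically once stripped.
aliases = ['if', 'if"']
p aliases.map { |name| strip_double_quotes(name) }.uniq
# prints: ["if"]
```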

It's kind of hacky, but it makes for nice error messages. Updated patch
is attached.

David
Nobuyoshi Nakada
2007-10-26 18:46:32 UTC
Hi,

At Wed, 24 Oct 2007 04:15:05 +0900,
Post by David Flanagan
syntax error, unexpected "end-of-file", expecting "end"
It doesn't tell you where the outermost open block begins, but at least
the message isn't so cryptic. I don't know any way to get rid of the
quotes around "end-of-file".
Which version of bison do you use? Bison 2.3 seems to strip
the quotes.
--
Nobu Nakada
David Flanagan
2007-10-26 19:07:38 UTC
Post by Nobuyoshi Nakada
Hi,
At Wed, 24 Oct 2007 04:15:05 +0900,
Post by David Flanagan
syntax error, unexpected "end-of-file", expecting "end"
It doesn't tell you where the outermost open block begins, but at least
the message isn't so cryptic. I don't know any way to get rid of the
quotes around "end-of-file".
Which version of bison do you use? Bison 2.3 seems to strip
the quotes.
I've got bison 2.0 by default on my Fedora Core 4 system. See my most
recent post on this thread for a yyerror hack that strips the quotes for
older systems like mine.

David
