Discussion:
std::stringstream and eof() strangeness
c***@gmail.com
2007-05-25 08:09:08 UTC
Hello group,

I'm writing a lexer taking input from a character stream. For testing
purposes, I frequently use std::stringstream, since it's easy to get
the input I want to test my lexer with.

I expected something like this:

stringstream ss("");
assert (ss.eof());

But it turns out not to be true. However, this works:

stringstream ss("");
ss.peek();
assert (ss.eof());

I am surprised. This forces me to call peek() when I start lexing,
which I feel is a hack.

So since ss.peek() is not supposed to change the stream, how come
eof() returns different results? Is this my implementation (GCC
3.4.5.) or is it standard?

Carl
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Carl Barron
2007-05-25 10:55:05 UTC
Post by c***@gmail.com
Hello group,
I'm writing a lexer taking input from a character stream. For testing
purposes, I frequently use std::stringstream, since it's easy to get
the input I want to test my lexer with.
stringstream ss("");
assert (ss.eof());
The stream has not been read from yet, so end-of-file has not been detected.
Post by c***@gmail.com
stringstream ss("");
ss.peek();
assert (ss.eof());
The stream attempts to read the first (nonexistent) char of the
stream and sets eofbit, since end-of-file has been reached. peek() on
its own does not set failbit; the next extraction attempt does.
So the proper approach is to attempt the read and then test for
failure, with the fail() or bad() functions or by using the stream in a
boolean context.
If failbit is set (s.fail() == true), then test eof() to see whether
end-of-file was the cause.
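A minimal sketch of the two cases from the original question, side by side. EofProbe and probe_eof() are names made up for this illustration, not part of any library; the code only samples eof() before and after a single peek().

```cpp
#include <sstream>
#include <string>

// Hypothetical record of what eof() reported around one peek() call.
struct EofProbe {
    bool before_peek;
    bool after_peek;
};

// Builds a stream over `text` and samples eof() before and after peek().
EofProbe probe_eof(const std::string& text) {
    std::istringstream ss(text);
    EofProbe p;
    p.before_peek = ss.eof();  // nothing has been read yet
    ss.peek();                 // an input operation; may reach end of input
    p.after_peek = ss.eof();
    return p;
}
```

With an empty string, before_peek should be false and after_peek true; with any non-empty string both should be false, since peek() finds a character to look at.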
Ulrich Eckhardt
2007-05-25 16:18:49 UTC
Post by c***@gmail.com
I'm writing a lexer taking input from a character stream. For testing
purposes, I frequently use std::stringstream, since it's easy to get
the input I want to test my lexer with.
stringstream ss("");
assert (ss.eof());
eof() returns true only once an operation(!) has reached EOF. You didn't do
anything with that stream, so no, the bit isn't set. I think you could try
the same with an empty file and it should behave the same.
Post by c***@gmail.com
stringstream ss("");
ss.peek();
assert (ss.eof());
I am surprised. This forces me to call peek() when I start lexing,
which I feel is a hack.
Hmmm, maybe you are writing your lexer the wrong way. What doesn't work is
this:

while(!in.eof()) {
    in >> token;
    use(token);
}

This rather has to be this:

while(in >> token) {
    use(token);
}

Or, if you're working characterwise:

while(in.get(character)) {
    use(character);
}
Post by c***@gmail.com
So since ss.peek() is not supposed to change the stream, how come
eof() returns different results?
It doesn't change the position in the input sequence, but that's all. After
all, it might require IO with an external device, so it must be allowed to
change things.
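
To make the character-wise loop concrete for a lexer, here is a rough sketch. The lex() function and its token rules (runs of digits, plus single non-space characters) are invented for illustration and are surely simpler than a real lexer. The point is only that get() drives the loop and peek() provides lookahead, with no up-front eof() test.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical mini-lexer: a token is either a run of digits or a single
// non-space character. It never asks eof() before reading.
std::vector<std::string> lex(std::istream& in) {
    std::vector<std::string> tokens;
    char c;
    while (in.get(c)) {  // the read itself is the loop condition
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        std::string token(1, c);
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // peek() returns traits_type::eof() at end of input, which is
            // not a digit, so the lookahead loop stops by itself.
            while (std::isdigit(in.peek())) {
                token += static_cast<char>(in.get());
            }
        }
        tokens.push_back(token);
    }
    return tokens;
}
```

Feeding it a std::istringstream holding "12+ 345" should give the three tokens "12", "+" and "345". Once peek() has hit the end of input, the next get() fails and the outer loop ends.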

Uli
--
Sator Laser GmbH
Geschäftsführer: Ronald Boers, Amtsgericht Hamburg HR B62 932
Alberto Ganesh Barbati
2007-05-25 16:21:51 UTC
Post by c***@gmail.com
stringstream ss("");
assert (ss.eof());
stringstream ss("");
ss.peek();
assert (ss.eof());
I am surprised. This forces me to call peek() when I start lexing,
which I feel is a hack.
So since ss.peek() is not supposed to change the stream, how come
eof() returns different results? Is this my implementation (GCC
3.4.5.) or is it standard?
It's standard. The semantics of eof() are sometimes misleading. The
standard says in 27.4.2.1.3: "eofbit indicates that an input operation
reached the end of an input sequence." So:

1) eofbit won't be set until you perform at least one input operation

2) peek() is not supposed to change the *position* in the input
sequence, but it can change the state of the stream. In particular,
being an "input operation" it shall set eofbit if EOF is reached

My experience is that you very rarely need to check eof(). A few legacy
I/O patterns require that you check for EOF before attempting to read,
and then perform the read. This is not the smartest way to work with C++
streams. With C++ streams you usually just perform the read operation
and *after that* you check for failure, either through an explicit call
to fail() or by using the implicit conversion to void*. For example:

if(in >> s)
{
    // success
}
else
{
    // failure
}

Usually you don't care why the operation failed and this snippet is
enough. However, you can check in the "failure" branch the value of
eof() to disambiguate whether the failure occurred because EOF is
reached or some other reason.
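
As a sketch of that last point, assuming the input is a whitespace-separated list of integers: ParseResult and read_ints() are hypothetical names, not anything from the standard library. The loop reads until an extraction fails and only then looks at eof() to see why.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical result type: the values read, plus why reading stopped.
struct ParseResult {
    std::vector<int> values;
    bool clean_eof;  // true if the failure was caused by running out of input
};

// Reads integers until an extraction fails, then uses eof() to tell
// "end of input" apart from "something that is not an integer".
ParseResult read_ints(std::istream& in) {
    ParseResult r;
    r.clean_eof = false;
    int v;
    while (in >> v) {
        r.values.push_back(v);
    }
    r.clean_eof = in.eof();
    return r;
}
```

For "1 2 3" this should yield three values with clean_eof set; for "1 2 x 3" it should stop after two values with clean_eof false, because the failure came from the 'x' and not from the end of the input.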

HTH,

Ganesh
Seungbeom Kim
2007-05-26 10:21:39 UTC
Post by c***@gmail.com
stringstream ss("");
assert (ss.eof());
stringstream ss("");
ss.peek();
assert (ss.eof());
I am surprised. This forces me to call peek() when I start lexing,
which I feel is a hack.
Can you tell me where you learned to use eof() for such a purpose?
I'm very curious about it, because I have seen many people trying
eof() or the similar feof() that way even though they were never
meant to be used that way (i.e. testing before any read operation).
--
Seungbeom Kim
s***@roguewave.com
2007-05-26 10:22:48 UTC
On May 25, 2:09 am, ***@gmail.com wrote:
[...]
Post by c***@gmail.com
stringstream ss("");
assert (ss.eof());
[...]
Post by c***@gmail.com
So since ss.peek() is not supposed to change the stream, how come
eof() returns different results? Is this my implementation (GCC
3.4.5.) or is it standard?
To find the required behavior in the first case you need to look at the
description of the stringstream constructor [stringstream.cons], follow
that to the iostream constructor [iostream.cons], then further to
istream and ostream constructors [istream.cons] and [ostream.cons],
respectively, and finally to ios::init() described in [basic.ios.cons].
There, in Table 110, is the initial value required to be returned by the
stream member function rdstate(), which in your case is goodbit since
the stringstream ctor calls init() with the value of its non-zero
rdbuf(). The only other possible value is badbit, when the argument to
init() is null. eofbit is never set by any standard stream ctor.
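
The two initial states described above can be checked with a small sketch like the following. The two function names are made up for the illustration; rdstate(), goodbit and badbit are the standard members.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <streambuf>

// State of a freshly constructed stringstream, before any input operation.
std::ios_base::iostate initial_state_of_stringstream() {
    std::stringstream ss("");
    return ss.rdstate();
}

// State of an istream whose init() received a null stream buffer.
std::ios_base::iostate initial_state_without_buffer() {
    std::streambuf* no_buffer = nullptr;
    std::istream in(no_buffer);
    return in.rdstate();
}
```

The first function should return goodbit even though the string is empty, and the second should return badbit. Neither has eofbit set.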

You are correct that according to the standard peek() is not supposed to
set eofbit. The handling of error conditions (including EOF) in practice
is not entirely consistent with the requirements, but the requirements
aren't always completely consistent either. There is an issue to clarify
the spec and make the required behavior self-consistent. See
http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#398
If you see an inconsistency in the implementation you use I suggest you
let its vendor or maintainers know to help them decide which way to go
when it comes to dealing with the issue.
Alberto Ganesh Barbati
2007-05-27 17:27:30 UTC
Post by s***@roguewave.com
You are correct that according to the standard peek() is not supposed
to set eofbit.
Could you please elaborate on that? My understanding is exactly the
opposite. See 27/3:

"If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then
the input function, except as explicitly noted otherwise, completes its
actions and does setstate(eofbit), which may throw ios_base::failure
(27.4.4.3), before returning."

As peek() is implemented through sgetc() (27.6.1.3/23) and it's not
"explicitly noted otherwise", I interpret this as a clear requirement
that peek() shall effectively set eofbit on EOF.

What am I missing?

Ganesh
s***@roguewave.com
2007-05-28 01:24:29 UTC
Post by Alberto Ganesh Barbati
Post by s***@roguewave.com
You are correct that according to the standard peek() is not supposed
to set eofbit.
Could you please elaborate on that? My understanding is exactly the
opposite. See 27/3:
"If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then
the input function, except as explicitly noted otherwise, completes its
actions and does setstate(eofbit), which may throw ios_base::failure
(27.4.4.3), before returning."
As peek() is implemented through sgetc() (27.6.1.3/23) and it's not
"explicitly noted otherwise", I interpret this as a clear requirement
that peek() shall effectively set eofbit on EOF.
You're correct. Library issue 60 (incorporated into C++ 2003)
(http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#60)
made peek() an unformatted input function and thus subject to the
blanket requirement to set eofbit on end-of-file. Prior to issue
60 peek() was neither a formatted nor an unformatted input function
so the blanket requirement didn't apply. That, incidentally, may
have been deliberate for compatibility with Classic Iostreams
which doesn't set eofbit after a peek() at EOF.

In any case, sorry for muddying the waters with all this. My
point was that the handling of error conditions in iostreams
is a messy area in general and should be cleaned up and made
consistent.
c***@gmail.com
2007-05-28 13:35:52 UTC
Thank you all for your explanations.

To answer Seungbeom Kim, I went with what seemed the most natural to
me - since I don't use iostreams a lot, I did it the way it used to be
done in plain old C and tested for EOF before reading. I see that there
are much better ways to do it, so I'll use those in the future. I'm
afraid that Sebor gave me undeserved credit, because my "should not
change anything" was merely reflecting a vague understanding of
things.

Learning a little every day...

Carl
Old Wolf
2007-05-29 02:47:43 UTC
Post by c***@gmail.com
To answer Seungbeom Kim, I went with what seemed the most natural to
me - since I don't use iostreams a lot, I did it the way it used to be
done in plain old C and tested for EOF before reading. I see that there
are much better ways to do it, so I'll use those in the future.
In C, the end-of-file indicator is not set until a read operation
fails either. If you open an empty file, feof(fp) returns
false (zero).
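A quick way to see this, sketched with the C stdio functions as exposed through <cstdio>. FeofProbe and probe_feof_on_empty_file() are names invented for the example, and it assumes std::tmpfile() is able to create a temporary file on the system at hand.

```cpp
#include <cstdio>

// Hypothetical record of what feof() reported around one read attempt.
struct FeofProbe {
    bool opened;       // false if the temporary file could not be created
    bool before_read;
    bool after_read;
};

// Opens an empty temporary file and samples feof() before and after fgetc().
FeofProbe probe_feof_on_empty_file() {
    FeofProbe p = {false, false, false};
    std::FILE* fp = std::tmpfile();
    if (!fp) return p;  // assumption failed: nothing to observe
    p.opened = true;
    p.before_read = std::feof(fp) != 0;  // no read has been attempted yet
    std::fgetc(fp);                      // this read hits end-of-file
    p.after_read = std::feof(fp) != 0;
    std::fclose(fp);
    return p;
}
```

On a freshly opened empty file before_read should be false; only after the failed fgetc() does feof() start reporting end-of-file.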
c***@gmail.com
2007-05-29 15:11:15 UTC
All right, I don't do much file-reading either... :-)

This is a good occasion to learn more on the topic. Is there a good
book (I mean really good) that extensively discusses the use of
iostreams for file access?

Carl
red floyd
2007-05-29 19:46:41 UTC
Post by c***@gmail.com
All right, I don't do much file-reading either... :-)
This is a good occasion to learn more on the topic. Is there a good
book (I mean really good) that extensively discusses the use of
iostreams for file access?
There's always Josuttis ("The C++ Standard Library"), and there's also
Langer and Kreft ("Standard C++ IOStreams and Locales").
Bo Persson
2007-05-29 22:12:59 UTC
Old Wolf wrote:
:: On May 29, 1:35 am, ***@gmail.com wrote:
::: To answer Seungbeom Kim, I went with what seemed the most natural
::: to me - since I don't use iostreams a lot, I did it the way it used
::: to be done in plain old C and tested for EOF before reading. I see
::: that there are much better ways to do it, so I'll use those in
::: the future.
::
:: In C, eof is not set until a read operation fails
:: either. If you open an empty file, feof(fp) returns
:: FALSE.

It just has to! :-)

I always imagine an input stream connected to the keyboard. How is the
stream to know if I intend to type some more characters, or not? What
should eof mean in that case?!


Bo Persson