Thinking In C++
Volume 2: Practical Programming
Bruce Eckel, President, MindView, Inc.
Chuck Allison, Utah Valley State College
“I’d
like to congratulate the both of you for a very impressive work! Not only did I
find your book to be an enjoyable and rewarding read … I was astounded by the
accuracy both in terms of technical correctness and use of the language … I
believe that you have attained a level of craftsmanship that is simply
outstanding.”
Bjorn Karlsson
Editorial Board, C/C++ Users Journal
“This book is a tremendous
achievement. You owe it to yourself to have a copy on your shelf.”
Al Stevens
Contributing Editor, Doctor Dobbs Journal
“Eckel’s book is the only one
to so clearly explain how to rethink program construction for object
orientation. That the book is also an excellent tutorial on the ins and outs of
C++ is an added bonus.”
Andrew Binstock
Editor, Unix Review
“Bruce continues to amaze me
with his insight into C++, and Thinking in C++ is his best collection of
ideas yet. If you want clear answers to difficult questions about C++, buy this
outstanding book.”
Gary Entsminger
Author, The Tao of Objects
“Thinking in C++ patiently
and methodically explores the issues of when and how to use inlines,
references, operator overloading, inheritance and dynamic objects, as well as
advanced topics such as the proper use of templates, exceptions and multiple
inheritance. The entire effort is woven in a fabric that includes Eckel’s own
philosophy of object and program design. A must for every C++ developer’s
bookshelf, Thinking in C++ is the one C++ book you must have if you’re
doing serious development with C++.”
Richard Hale Shaw
Contributing Editor, PC Magazine
CIP DATA AVAILABLE
Vice President and Editorial Director,
ECS: Marcia J. Horton
Publisher: Alan R. Apt
Associate Editor: Toni Dianne Holm
Editorial Assistant: Patrick Lindner
Vice President and Director of Production
and Manufacturing, ESM: David W. Riccardi
Executive Managing Editor: Vince O’Brien
Managing Editor: Camille Trentacoste
Production Editor: Irwin Zucker
Director of Creative Services: Paul
Belfanti
Creative Director: Carole Anson
Cover and Interior Designer: Daniel
Will-Harris
Cover Illustrations: Tina Jensen
Manufacturing Manager: Trudy Pisciotti
Manufacturing Buyer: Lisa McDowell
Marketing Manager: Pamela Shaffer
©2004 MindView, Inc.
Published by Pearson Prentice Hall
Pearson Education, Inc.
Upper Saddle River, NJ 07458
All rights reserved. No part of this book may be
reproduced in any form or by any means, without permission in writing from the
publisher.
Pearson Prentice Hall® is a trademark of Pearson
Education, Inc.
The authors and publisher of this book have used their
best efforts in preparing this book. These efforts include the development,
research, and testing of the theories and programs to determine their
effectiveness. The authors and publisher make no warranty of any kind,
expressed or implied, with regard to these programs or the documentation
contained in this book. The authors and publisher shall not be liable in any
event for incidental or consequential damages in connection with, or arising
out of, the furnishing, performance, or use of these programs.
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
ISBN 0-13-035313-2
Pearson Education Ltd., London
Pearson Education Australia Pty. Ltd., Sydney
Pearson Education Singapore, Pte. Ltd.
Pearson Education North Asia Ltd., Hong
Kong
Pearson Education Canada, Inc., Toronto
Pearson Educación de Mexico, S.A. de C.V.
Pearson Education-Japan, Tokyo
Pearson Education Malaysia, Pte. Ltd.
Pearson Education, Inc., Upper Saddle
River, New Jersey
Dedication
To all those who have
worked tirelessly
to develop the C++ language
In Volume 1 of this book, you learned the fundamentals of C and
C++. In this volume, we look at more advanced features, with an eye towards
developing techniques and ideas that produce robust C++ programs.
We assume you are familiar with the material presented in
Volume 1.
Our goals in this book are to:
1. Present the material a simple step at a time, so the reader can
easily digest each concept before moving on.
2. Teach “practical programming” techniques that you can use on a
day-to-day basis.
3. Give you what we think is important for you to understand about
the language, rather than everything we know. We believe there is an
“information importance hierarchy,” and there are some facts that 95% of
programmers will never need to know and that would just confuse people and add
to their perception of the complexity of the language. To take an example from
C, if you memorize the operator precedence table (we never did) you can write
clever code. But if you must think about it, it will confuse the
reader/maintainer of that code. So forget about precedence and use parentheses
when things aren’t clear. This same attitude will be taken with some
information in the C++ language, which is more important for compiler writers
than for programmers.
4. Keep each section focused enough so the lecture time—and the time
between exercise periods—is small. Not only does this keep the audience’s minds
more active and involved during a hands-on seminar, but it gives the reader a
greater sense of accomplishment.
5. Avoid depending on any particular vendor’s version of
C++. We have tested the code on all the implementations we could (described
later in this introduction), and when one implementation absolutely refused to
work because it doesn’t conform to the C++ Standard, we’ve flagged that fact in
the example (you’ll see the flags in the source code) to exclude it from the
build process.
6. Automate the compiling and testing of the code in the book. We
have discovered that code that isn’t compiled and tested is probably broken, so
in this volume we’ve instrumented the examples with test code. In addition, the
code that you can download from http://www.MindView.net has been extracted
directly from the text of the book using programs that automatically create
makefiles to compile and run the tests. This way we know that the code in the
book is correct.
Here is a brief description of the chapters contained in
this book:
Part 1: Building Stable Systems
1. Exception handling. Error handling has always been
a problem in programming. Even if you dutifully return error information or set
a flag, the function caller may simply ignore it. Exception handling is a
primary feature in C++ that solves this problem by allowing you to “throw” an
object out of your function when a critical error happens. You throw different
types of objects for different errors, and the function caller “catches” these
objects in separate error handling routines. If you throw an exception, it
cannot be ignored, so you can guarantee that something will happen in
response to your error. The decision to use exceptions affects code design in positive,
fundamental ways.
2. Defensive Programming. Many software problems can
be prevented. To program defensively is to craft code in such a way that bugs are
found and fixed early before they can do damage in the field. Using assertions is
the single most important way to validate your code during development, while
at the same time leaving an executable documentation trail in your code that
reveals your thoughts while you wrote the code in the first place. Rigorously
test your code before you let it out of your hands. An automated unit testing framework
is an indispensable tool for successful, everyday software development.
Part 2: The Standard C++ Library
3. Strings in Depth. The most common programming
activity is text processing. The C++ string class relieves the programmer from
memory management issues, while at the same time delivering a powerhouse of
text processing capability. C++ also supports the use of wide characters and
locales for internationalized applications.
4. Iostreams. One of the original C++ libraries—the
one that provides the essential I/O facility—is called iostreams. Iostreams is
intended to replace C’s stdio.h with an I/O library that is easier to
use, more flexible, and extensible—you can adapt it to work with your new
classes. This chapter teaches you how to make the best use of the existing
iostream library for standard I/O, file I/O, and in-memory formatting.
5. Templates in Depth. The distinguishing feature of
“modern C++” is the broad power of templates. Templates do more than just create
generic containers. They support development of robust, generic,
high-performance libraries. There is a lot to know about templates—they
constitute, as it were, a sub-language within the C++ language, and give the
programmer an impressive degree of control over the compilation process. It is
not an overstatement to say that templates have revolutionized C++ programming.
6. Generic Algorithms. Algorithms are at the
core of computing, and C++, through its template facility, supports an
impressive entourage of powerful, efficient, and easy-to-use generic
algorithms. The standard algorithms are also customizable through function
objects. This chapter looks at every algorithm in the library. (Chapters 6 and
7 cover that portion of the Standard C++ library commonly known as the Standard
Template Library, or STL.)
7. Generic Containers & Iterators. C++
supports all the common data structures in a type-safe manner. You never need
to worry about what such a container holds. The homogeneity of its objects is
guaranteed. Separating the traversing of a container from the container itself,
another accomplishment of templates, is made possible through iterators. This
ingenious arrangement allows a flexible application of algorithms to containers
using the simplest of designs.
Part 3: Special Topics
8. Runtime type identification. Runtime type
identification (RTTI) finds the exact type of an object when you only have a
pointer or reference to the base type. Normally, you’ll want to intentionally
ignore the exact type of an object and let the virtual function mechanism
implement the correct behavior for that type. But occasionally (like when
writing software tools such as debuggers) it is helpful to know the exact type
of an object—with this information, you can often perform a special-case
operation more efficiently. This chapter explains what RTTI is for and how to
use it.
9. Multiple inheritance. This sounds simple at first:
A new class is inherited from more than one existing class. However, you can
end up with ambiguities and multiple copies of base-class objects. That problem
is solved with virtual base classes, but the bigger issue remains: When do you
use it? Multiple inheritance is only essential when you need to manipulate an
object through more than one common base class. This chapter explains the
syntax for multiple inheritance and shows alternative approaches—in particular,
how templates solve one typical problem. Using multiple inheritance to repair a
“damaged” class interface is demonstrated as a valuable use of this feature.
10. Design Patterns. The most revolutionary advance
in programming since objects is the introduction of design patterns. A
design pattern is a language-independent codification of a solution to a common
programming problem, expressed in such a way that it can apply to many
contexts. Patterns such as Singleton, Factory Method, and Visitor now find
their way into daily discussions around the keyboard. This chapter shows how to
implement and use some of the more useful design patterns in C++.
11. Concurrent Programming. People have come to
expect responsive user interfaces that (seem to) process multiple tasks
simultaneously. Modern operating systems allow processes to have multiple
threads that share the process address space. Multithreaded programming
requires a different mindset, however, and comes with its own set of difficulties.
This chapter uses a freely available library (the ZThread library by Eric
Crahen of IBM) to show how to effectively manage multithreaded applications in
C++.
We have discovered that simple exercises are exceptionally
useful during a seminar to complete a student’s understanding. You’ll find a
set at the end of each chapter.
These are fairly simple, so they can be finished in a
reasonable amount of time in a classroom situation while the instructor
observes, making sure all the students are absorbing the material. Some
exercises are a bit more challenging to keep advanced students entertained.
They’re all designed to be solved in a short time and are only there to test
and polish your knowledge rather than present major challenges (presumably,
you’ll find those on your own—or more likely they’ll find you).
Solutions to exercises can be found in the electronic
document The C++ Annotated Solution Guide, Volume 2, available for a nominal
fee from http://www.MindView.net.
The source code for this book is copyrighted freeware,
distributed via the web site http://www.MindView.net. The copyright prevents
you from republishing the code in print media without permission.
In the starting directory where you unpack the code you will
find the following copyright notice:
//:! :CopyRight.txt
(c) 1995-2004 MindView, Inc. All rights reserved.
Source code file from the book
"Thinking in C++, 2nd Edition, Volume 2."
The following permissions are granted respecting the
computer source code, which is contained in this file:
Permission is granted to classroom educators to use this
file as part of instructional materials prepared for
classes personally taught or supervised by the educator who
uses this permission, provided that (a) the book "Thinking
in C++" is cited as the origin on each page or slide that
contains any part of this file, and (b) that you may not
remove the above copyright legend nor this notice. This
permission extends to handouts, slides and other
presentation materials.
For purposes that do not include the publication or
presentation of educational or instructional materials,
permission also is granted to computer program designers
and programmers, and to their employers and customers, (a)
to use and modify this file for the purpose of creating
executable computer software, and (b) to distribute
resulting computer programs in binary form only, provided
that (c) you may not remove the above copyright legend nor
this notice from retained source code copies of this file,
and (d) each copy distributed in binary form has embedded
within it the above copyright notice.
Apart from the permissions granted above, the sole
authorized distribution point for additional copies of this
file is http://www.MindView.net (and official mirror sites)
where it is available, subject to the permissions and
restrictions set forth herein.
The following are clarifications of the limited permissions
granted above:
1. You may not publish or distribute originals or
modified versions of the source code to the software other
than in classroom situations described above.
2. You may not use the software file or portions
thereof in printed media without the express permission of
the copyright owner.
The copyright owner and author or authors make no
representation about the suitability of this software for
any purpose. It is provided "as is," and all express,
implied, and statutory warranties and conditions of any
kind including any warranties and conditions of
merchantability, satisfactory quality, security, fitness
for a particular purpose and non-infringement, are
disclaimed. The entire risk as to the quality and
performance of the software is with you.
In no event will the authors or the publisher be liable for
any lost revenue, savings, or data, or for direct,
indirect, special, consequential, incidental, exemplary or
punitive damages, however caused and regardless of any
related theory of liability, arising out of this license
and/or the use of or inability to use this software, even
if the vendors and/or the publisher have been advised of
the possibility of such damages. Should the software prove
defective, you assume the cost of all necessary servicing,
repair, or correction.
If you think you have a correction for an error in the
software, please submit the correction to www.MindView.net.
(Please use the same process for non-code errors found in
the book.)
If you have a need for permissions not granted above,
please inquire of MindView, Inc., at www.MindView.net or
send a request by email to Bruce@EckelObjects.com.
///:~
You may use the code in your projects and in the classroom
as long as the copyright notice is retained.
Your compiler may not support all the features discussed in
this book, especially if you don’t have the newest version of your compiler.
Implementing a language like C++ is a Herculean task, and you can expect that
the features will appear in pieces rather than all at once. But if you attempt
one of the examples in the book and get a lot of errors from the compiler, it’s
not necessarily a bug in the code or the compiler—it may simply not be
implemented in your particular compiler yet.
We used a number of compilers to test the code in this book,
in an attempt to ensure that our code conforms to the C++ Standard and will
work with as many compilers as possible. Unfortunately, not all compilers
conform to the C++ Standard, and so we have a way of excluding certain files
from building with those compilers. These exclusions are reflected in the
makefiles automatically created for the package of code for this book that you
can download from www.MindView.net. You can see the exclusion tags embedded in
the comments at the beginning of each listing, so you will know whether to
expect a particular compiler to work on that code (in a few cases, the compiler
will actually compile the code but the execution behavior is wrong, and we
exclude those as well).
Here are the tags and the compilers that they exclude from
the build:
· {-dmc} Walter Bright’s Digital Mars compiler for Windows,
freely downloadable at www.DigitalMars.com. This compiler is very conformant
and so you will see almost none of these tags throughout the book.
· {-g++} The free Gnu C++ 3.3.1, which comes pre-installed
in most Linux packages and Macintosh OS X. It is also part of Cygwin for Windows
(see below). It is available for most other platforms from gcc.gnu.org.
· {-msc} Microsoft Version 7 with Visual C++ .NET (only
comes with Visual Studio .NET; not freely downloadable).
· {-bor} Borland C++ Version 6 (not the free download; this
one is more up to date).
· {-edg} Edison Design Group (EDG) C++. This is the
benchmark compiler for standards conformance. This tag occurs only because of
library issues, and because we were using a complimentary copy of the EDG front
end with a complimentary library implementation from Dinkumware, Ltd. No
compile errors occurred because of the compiler alone.
· {-mwcc} Metrowerks Code Warrior for Macintosh OS X. Note
that OS X comes with Gnu C++ pre-installed, as well.
If you download and unpack the code package for this book
from www.MindView.net, you’ll find the makefiles to build the code for the
above compilers. We used the freely available GNU-make, which comes with
Linux, Cygwin (a free Unix shell that runs on top of Windows; see
www.Cygwin.com), or can be installed on your platform—see
www.gnu.org/software/make. (Other makes may or may not work with these
files, but are not supported.) Once you install make, if you type make
at the command line you’ll get instructions on how to build the book’s code for
the above compilers.
Note that the placement of these tags on the files in this
book indicates the state of the particular version of the compiler at the time
we tried it. It’s possible and likely that the compiler vendor has improved the
compiler since the publication of this book. It’s also possible that while
building the book with so many compilers, we may have misconfigured a
particular compiler that would otherwise have compiled the code correctly. Thus,
you should try the code yourself on your compiler, and also check the code
downloaded from www.MindView.net to see what is current.
Throughout this book, when referring to conformance to the
ANSI/ISO C standard, we will be referring to the 1989 standard, and will
generally just say ‘C.’ Only if it is necessary to distinguish between
Standard 1989 C and older, pre-Standard versions of C will we make the
distinction. We do not reference C99 in this book.
The ANSI/ISO C++ Committee long ago finished working on the first C++ Standard, commonly known as C++98. We will use the term Standard
C++ to refer to this standardized language. If we simply refer to C++,
assume we mean “Standard C++.” The C++ Standards Committee continues to address
issues important to the C++ community that will become C++0x, a future C++
Standard not likely to be available for many years.
Seminars, CD-ROMs & consulting
Bruce Eckel’s company, MindView, Inc., provides public
hands-on training seminars based on the material in this book, and also for
advanced topics. Selected material from each chapter represents a lesson, which
is followed by a monitored exercise period so each student receives personal
attention. We also provide on-site training, consulting, mentoring, and design
& code walkthroughs. Information and sign-up forms for upcoming seminars
and other contact information can be found at http://www.MindView.net.
No matter how many tricks writers use to detect errors, some
always creep in and these often leap off the page for a fresh reader. If you
discover anything you believe to be an error, please use the feedback system
built into the electronic version of this book, which you will find at http://www.MindView.net.
Your help is appreciated.
The cover artwork was painted by Larry O’Brien’s wife, Tina
Jensen (yes, the Larry O’Brien who was the editor of Software Development
Magazine for so many years). Not only are the pictures beautiful, they are also
excellent suggestions of polymorphism. The idea for using these images came
from Daniel Will-Harris, the cover designer (www.Will-Harris.com), working with
Bruce.
Volume 2 of this book languished in a half-completed state
for a long time while Bruce got distracted with other things, notably Java,
Design Patterns and especially Python (see www.Python.org). If Chuck hadn’t
been willing (foolishly, he has sometimes thought) to finish the other half and
bring things up-to-date, this book almost certainly wouldn’t have happened.
There aren’t that many people whom Bruce would have felt comfortable entrusting
this book to. Chuck’s penchant for precision, correctness and clear explanation
is what has made this book as good as it is.
Jamie King acted as an intern under Chuck’s direction during
the completion of this book. He was an essential part of making sure the book
got finished, not only by providing feedback for Chuck, but especially because
of his relentless questioning and picking of every single possible nit that he
didn’t completely understand. If your questions are answered by this book, it’s
probably because Jamie asked them first. Jamie also enhanced a number of the
sample programs and created many of the exercises at the end of each chapter.
Scott Baker, another of Chuck’s interns funded by MindView, Inc., helped with
the exercises for Chapter 3.
Eric Crahen of IBM was instrumental in the completion of
Chapter 11 (Concurrency). When we were looking for a threads package, we sought
out one that was intuitive and easy to use, while being sufficiently robust to
do the job. With Eric we got that and then some—he was extremely cooperative
and has used our feedback to enhance his library, while we have benefited from
his insights as well.
We are grateful to Pete Becker for being our technical
editor. Few people are as articulate and discriminating as Pete, not to mention
as expert in C++ and software development in general. We also thank Bjorn
Karlsson for his gracious and timely technical assistance as he reviewed the
entire manuscript on short notice.
Walter Bright made Herculean efforts to make sure that his
Digital Mars C++ compiler would compile the examples in this book. He makes the
compiler available for free downloads at http://www.DigitalMars.com. Thanks, Walter!
The ideas and understanding in this book have come from many
other sources, as well: friends like Andrea Provaglio, Dan Saks, Scott Meyers,
Charles Petzold, and Michael Wilk; pioneers of the language like Bjarne
Stroustrup, Andrew Koenig, and Rob Murray; members of the C++ Standards
Committee like Nathan Myers (who was particularly helpful and generous with his
insights), Herb Sutter, PJ Plauger, Kevlin Henney, David Abrahams, Tom Plum,
Reg Charney, Tom Penello, Sam Druker, Uwe Steinmueller, John Spicer, Steve
Adamczyk, and Daveed Vandevoorde; people who have spoken in the C++ track at
the Software Development Conference (which Bruce created and developed, and
Chuck spoke in); colleagues of Chuck like Michael Seaver, Huston Franklin,
David Wagstaff, and often students in seminars, who ask the questions we need
to hear to make the material clearer.
The book design, typeface selection, cover design, and cover
photo were created by Bruce’s friend Daniel Will-Harris, noted author and
designer, who used to play with rub-on letters in junior high school while he
awaited the invention of computers and desktop publishing. However, we produced
the camera-ready pages ourselves, so the typesetting errors are ours. Microsoft®
Word XP was used to write the book and to create camera-ready pages. The body
typeface is Verdana and the headlines are in Verdana. The code typeface is
Courier New.
We also wish to thank the
generous professionals at the Edison Design Group and Dinkumware, Ltd., for
giving us complimentary copies of their compiler and library (respectively).
Without their expert assistance, graciously given, some of the examples in this
book could not have been tested. We also wish to thank Howard Hinnant and the
folks at Metrowerks for a copy of their compiler, and Sandy Smith and the folks
at SlickEdit for keeping Chuck supplied with a world-class editing environment
for so many years. Greg Comeau also provided a copy of his successful EDG-based
compiler, Comeau C++.
A special thanks to all
our teachers, and all our students (who are our teachers as well).
Evan Cofsky
(Evan@TheUnixMan.com) provided all sorts of assistance on the server as well as
development of programs in his now-favorite language, Python. Sharlynn Cobaugh
and Paula Steuer were instrumental assistants, preventing Bruce from being
washed away in a flood of projects.
Bruce’s sweetie Dawn McGee provided much-appreciated
inspiration and enthusiasm during this project. The supporting cast of friends
includes, but is not limited to: Mark Western, Gen Kiyooka, Kraig Brockschmidt,
Zack Urlocker, Andrew Binstock, Neil Rubenking, Steve Sinofsky, JD Hildebrandt,
Brian McElhinney, Brinkley Barr, Bill Gates at Midnight Engineering Magazine,
Larry Constantine & Lucy Lockwood, Tom Keffer, Greg Perry, Dan Putterman,
Christi Westphal, Gene Wang, Dave Mayer, David Intersimone, Claire Sawyers, The
Italians (Andrea Provaglio, Laura Fallai, Marco Cantu, Corrado, Ilsa and
Christina Giustozzi), Chris & Laura Strand, The Almquists, Brad Jerbic,
John Kruth & Marilyn Cvitanic, Holly Payne (yes, the famous novelist!),
Mark Mabry, The Robbins Families, The Moelter Families (& the McMillans),
The Wilks, Dave Stoner, Laurie Adams, The Cranstons, Larry Fogg, Mike &
Karen Sequeira, Gary Entsminger & Allison Brody, Chester Andersen, Joe
Lordi, Dave & Brenda Bartlett, The Rentschlers, The Sudeks, Lynn &
Todd, and their families. And of course, Mom & Dad, Sandy, James &
Natalie, Kim & Jared, Isaac, and Abbi.
Software engineers spend about as much time validating code as
they do creating it. Quality is, or should be, the goal of every programmer, and
one can go a long way towards that goal by eliminating problems before they happen.
In addition, software systems should be robust enough to behave reasonably in
the presence of unforeseen environmental problems.
Exceptions were introduced into C++ to support sophisticated
error handling without cluttering code with an inordinate amount of
error-handling logic. Chapter 1 shows how proper use of exceptions can make for
well-behaved software, and also introduces the design principles that underlie
exception-safe code. In Chapter 2 we cover unit testing and debugging
techniques intended to maximize code quality long before it’s released. The use
of assertions to express and enforce program invariants is a sure sign of an
experienced software engineer. We also introduce a simple framework to support
unit testing.
Improving error recovery is one of the most powerful ways you can increase the robustness of your code.
Unfortunately, it’s almost accepted practice to ignore error
conditions, as if we’re in a state of denial about errors. One reason, no
doubt, is the tediousness and code bloat of checking for many errors. For
example, printf( ) returns the number of characters that were
successfully printed, but virtually no one checks this value. The proliferation
of code alone would be disgusting, not to mention the difficulty it would add
in reading the code.
The problem with C’s approach to error handling could be
thought of as coupling—the user of a function must tie the error-handling code
so closely to that function that it becomes too ungainly and awkward to use.
One of the major features in C++ is exception handling,
which is a better way of thinking about and handling errors. With exception handling:
1. Error-handling code is not nearly so tedious to write, and it
doesn’t become mixed up with your “normal” code. You write the code you want
to happen; later in a separate section you write the code to cope with the
problems. If you make multiple calls to a function, you handle the errors from
that function once, in one place.
2. Errors cannot be ignored. If a function needs to send an error
message to the caller of that function, it “throws” an object representing that
error out of the function. If the caller doesn’t “catch” the error and handle
it, it goes to the next enclosing dynamic scope, and so on until the error is
either caught or the program terminates because there was no handler to catch
that type of exception.
This chapter examines C’s approach to error handling (such as it is), discusses why it did not work well for C, and explains why it won’t work at all
for C++. This chapter also covers try, throw, and catch,
the C++ keywords that support exception handling.
In most of the examples in these volumes, we use assert( )
as it was intended: for debugging during development with code that can be
disabled with #define NDEBUG for the shipping product. Runtime
error checking uses the require.h functions (assure( ) and require( ))
developed in Chapter 9 in Volume 1 and repeated here in Appendix B. These
functions are a convenient way to say, “There’s a problem here you’ll probably
want to handle with some more sophisticated code, but you don’t need to be
distracted by it in this example.” The require.h functions might be
enough for small programs, but for complicated products you’ll want to write
more sophisticated error-handling code.
Error handling is quite straightforward when you know
exactly what to do, because you have all the necessary information in that
context. You can just handle the error at that point.
The problem occurs when you don’t have enough
information in that context, and you need to pass the error information into a
different context where that information does exist. In C, you can handle this
situation using three approaches:
1. Return error information from the function or, if the return
value cannot be used this way, set a global error condition flag. (Standard C
provides errno and perror( ) to support this.) As mentioned
earlier, the programmer is likely to ignore the error information because
tedious and obfuscating error checking must occur with each function call. In
addition, returning from a function that hits an exceptional condition might
not make sense.
2. Use the little-known Standard C library signal-handling system,
implemented with the signal( ) function (to determine what happens
when the event occurs) and raise( ) (to generate an event). Again,
this approach involves high coupling because it requires the user of any
library that generates signals to understand and install the appropriate
signal-handling mechanism. In large projects the signal numbers from different
libraries might clash.
3. Use the nonlocal goto functions in the Standard C library:
setjmp( ) and longjmp( ). With setjmp( )
you save a known good state in the program, and if you get into trouble, longjmp( )
will restore that state. Again, there is high coupling between the place where
the state is stored and the place where the error occurs.
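The first approach in the list above can be sketched in C++ syntax with the Standard C function strtol( ) and errno. The helper name parse_long is hypothetical and exists only for this illustration; the point is how much checking surrounds a single call, and that nothing forces the caller to look at the result.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper (not from the book): converts text to a long,
// C style. Returns true on success; on failure, out is left untouched.
bool parse_long(const char* text, long& out) {
  errno = 0;                        // Clear the global flag before the call
  char* end = 0;
  long value = std::strtol(text, &end, 10);
  if (end == text || *end != '\0')  // No digits at all, or trailing junk
    return false;
  if (errno == ERANGE)              // Value does not fit in a long
    return false;
  out = value;
  return true;
}
```

A caller that writes parse_long(input, n); and never tests the returned bool compiles without complaint, which is exactly the weakness described in item 1.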
When considering error-handling schemes with C++, there’s an
additional critical problem: The C techniques of signals and setjmp( )/longjmp( )
do not call destructors, so objects aren’t properly cleaned up. (In fact, if longjmp( )
jumps past the end of a scope where destructors should be called, the behavior
of the program is undefined.) This makes it virtually impossible to effectively
recover from an exceptional condition because you’ll always leave objects
behind that haven’t been cleaned up and that can no longer be accessed. The
following example demonstrates this with setjmp/longjmp:
//: C01:Nonlocal.cpp
// setjmp() & longjmp().
#include <iostream>
#include <csetjmp>
using namespace std;

class Rainbow {
public:
  Rainbow() { cout << "Rainbow()" << endl; }
  ~Rainbow() { cout << "~Rainbow()" << endl; }
};

jmp_buf kansas;

void oz() {
  Rainbow rb;
  for(int i = 0; i < 3; i++)
    cout << "there's no place like home" << endl;
  longjmp(kansas, 47);
}

int main() {
  if(setjmp(kansas) == 0) {
    cout << "tornado, witch, munchkins..." << endl;
    oz();
  } else {
    cout << "Auntie Em! "
         << "I had the strangest dream..."
         << endl;
  }
} ///:~
The setjmp( ) function is odd because if you
call it directly, it stores all the relevant information about the current
processor state (such as the contents of the instruction pointer and runtime
stack pointer) in the jmp_buf and returns zero. In this case it behaves
like an ordinary function. However, if you call longjmp( ) using
the same jmp_buf, it’s as if you’re returning from setjmp( )
again—you pop right out the back end of the setjmp( ). This time,
the value returned is the second argument to longjmp( ), so you can
detect that you’re actually coming back from a longjmp( ). You can
imagine that with many different jmp_bufs, you could pop around to many
different places in the program. The difference between a local goto
(with a label) and this nonlocal goto is that you can return to any
pre-determined location higher up in the runtime stack with setjmp( )/longjmp( )
(wherever you’ve placed a call to setjmp( )).
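The "second return" from setjmp( ) can be observed in a small sketch. The names checkpoint, bail_out, and value_seen_after_jump are hypothetical, and the code deliberately contains no objects with destructors. It relies on two facts about the Standard C library: setjmp( ) may be used as the controlling expression of a switch, and a longjmp( ) with a value of 0 makes setjmp( ) return 1 instead.

```cpp
#include <csetjmp>

namespace {
std::jmp_buf checkpoint;  // Hypothetical name for the saved state

// Never returns normally: control reappears inside setjmp().
void bail_out(int code) { std::longjmp(checkpoint, code); }
}

// Reports which value setjmp() yielded when control came back.
int value_seen_after_jump(int code) {
  switch (setjmp(checkpoint)) {
    case 0:             // Direct call: state saved, now trigger the jump
      bail_out(code);
      return -1;        // Not reached
    case 1:  return 1;  // longjmp(checkpoint, 1) or longjmp(checkpoint, 0)
    case 47: return 47;
    default: return 0;  // Some other nonzero code
  }
}
```

Calling value_seen_after_jump(47) yields 47, and value_seen_after_jump(0) yields 1, so a jump can never be mistaken for the initial direct call.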
The problem in C++ is that longjmp( ) doesn’t
respect objects; in particular it doesn’t call destructors when it jumps out of
a scope. Destructor
calls are essential, so this approach won’t work with C++. In fact, the C++
Standard states that branching into a scope with goto (effectively
bypassing constructor calls), or branching out of a scope with longjmp( )
where an object on the stack has a destructor, constitutes undefined behavior.
If you encounter an exceptional situation in your code—that
is, if you don’t have enough information in the current context to decide what
to do—you can send information about the error into a larger context by
creating an object that contains that information and “throwing” it out of your
current context. This is called throwing an exception. Here’s what it
looks like:
//: C01:MyError.cpp {RunByHand}
class MyError {
  const char* const data;
public:
  MyError(const char* const msg = 0) : data(msg) {}
};

void f() {
  // Here we "throw" an exception object:
  throw MyError("something bad happened");
}

int main() {
  // As you’ll see shortly, we’ll want a "try block" here:
  f();
} ///:~
MyError is an ordinary class, which in this case
takes a const char* as a constructor argument. You can use any type when you
throw (including built-in types), but usually you’ll create special classes for
throwing exceptions.
The keyword throw causes a number of relatively
magical things to happen. First, it creates a copy of the object you’re
throwing and, in effect, “returns” it from the function containing the throw
expression, even though that object type isn’t normally what the function is
designed to return. A naive way to think about exception handling is as an
alternate return mechanism (although you’ll find you can get into trouble if
you take that analogy too far). You can also exit from ordinary scopes by
throwing an exception. In any case, a value is returned, and the function or
scope exits.
Any similarity to a return statement ends there
because where you return is some place completely different from where a
normal function call returns. (You end up in an appropriate part of the
code—called an exception handler—that might be far removed from where the
exception was thrown.) In addition, any local objects created by the time the
exception occurs are destroyed. This automatic cleanup of local objects is
often called “stack unwinding.”
In addition, you can throw as many different types of
objects as you want. Typically, you’ll throw a different type for each category
of error. The idea is to store the information in the object and in the name
of its class so that someone in a calling context can figure out what to do
with your exception.
As mentioned earlier, one of the advantages of C++ exception
handling is that you can concentrate on the problem you’re trying to solve in
one place, and then deal with the errors from that code in another place.
If you’re inside a function and you throw an exception (or a
called function throws an exception), the function exits because of the thrown
exception. If you don’t want a throw to leave a function, you can set up
a special block within the function where you try to solve your actual
programming problem (and potentially generate exceptions). This block is called
the try block because you try your various function calls there.
The try block is an ordinary scope, preceded by the keyword try:
try {
  // Code that may generate exceptions
}
If you check for errors by carefully examining the return
codes from the functions you use, you need to surround every function call with
setup and test code, even if you call the same function several times. With
exception handling, you put everything in a try block and handle
exceptions after the try block. Thus, your code is a lot easier to write
and to read because the goal of the code is not confused with the error handling.
Of course, the thrown exception must end up some place. This
place is the exception handler, and you need one exception handler for every exception type you want to catch. However, polymorphism also works for
exceptions, so one exception handler can work with an exception type and
classes derived from that type.
Exception handlers immediately follow the try block
and are denoted by the keyword catch:
try {
  // Code that may generate exceptions
} catch(type1 id1) {
  // Handle exceptions of type1
} catch(type2 id2) {
  // Handle exceptions of type2
} catch(type3 id3) {
  // Etc...
} catch(typeN idN) {
  // Handle exceptions of typeN
}
// Normal execution resumes here...
The syntax of a catch clause resembles functions that
take a single argument. The identifier (id1, id2, and so on) can
be used inside the handler, just like a function argument, although you can
omit the identifier if it’s not needed in the handler. The exception type
usually gives you enough information to deal with it.
The handlers must appear directly after the try
block. If an exception is thrown, the exception-handling mechanism goes hunting
for the first handler with an argument that matches the type of the exception.
It then enters that catch clause, and the exception is considered
handled. (The search for handlers stops once the catch clause is found.)
Only the matching catch clause executes; control then resumes after the
last handler associated with that try block.
Notice that, within the try block, a number of
different function calls might generate the same type of exception, but you
need only one handler.
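Here is a minimal sketch of those points: one exception type per category of error, a single handler that serves several throw sites, and a handler that omits its identifier. All of the names (ParseError, DiskFull, read_header, and so on) are hypothetical, and std::to_string assumes a C++11 or later compiler.

```cpp
#include <string>

// Hypothetical exception types, one per category of error.
class ParseError {
  int line;
public:
  explicit ParseError(int l) : line(l) {}
  int where() const { return line; }
};
class DiskFull {};

// Three steps, each of which may fail depending on n.
void read_header(int n)  { if (n == 1) throw ParseError(1); }
void read_body(int n)    { if (n == 2) throw ParseError(20); }
void write_result(int n) { if (n == 3) throw DiskFull(); }

std::string process(int n) {
  try {
    read_header(n);
    read_body(n);
    write_result(n);
    return "done";
  } catch (const ParseError& e) {  // Serves read_header and read_body
    return "parse error at line " + std::to_string(e.where());
  } catch (const DiskFull&) {      // No identifier: the type says it all
    return "disk full";
  }
}
```

The three calls inside the try block read as a plain statement of intent; none of them is wrapped in its own test code.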
To illustrate try and catch, the following
variation of Nonlocal.cpp replaces the call to setjmp( )
with a try block and replaces the call to longjmp( ) with a throw
statement:
//: C01:Nonlocal2.cpp
// Illustrates exceptions.
#include <iostream>
using namespace std;

class Rainbow {
public:
  Rainbow() { cout << "Rainbow()" << endl; }
  ~Rainbow() { cout << "~Rainbow()" << endl; }
};

void oz() {
  Rainbow rb;
  for(int i = 0; i < 3; i++)
    cout << "there's no place like home" << endl;
  throw 47;
}

int main() {
  try {
    cout << "tornado, witch, munchkins..." << endl;
    oz();
  } catch(int) {
    cout << "Auntie Em! I had the strangest dream..."
         << endl;
  }
} ///:~
When the throw statement in oz( )
executes, program control backtracks until it finds the catch clause
that takes an int parameter. Execution resumes with the body of that catch
clause. The most important difference between this program and Nonlocal.cpp
is that the destructor for the object rb is called when the throw
statement causes execution to leave the function oz( ).
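When the exception crosses more than one function, the same cleanup happens at every level. The following sketch records events in a log so that the order can be inspected. The class Noisy, the events log, and the function names are hypothetical and used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical log so the order of events can be inspected afterward.
std::vector<std::string> events;

class Noisy {
  std::string name;
public:
  explicit Noisy(const std::string& n) : name(n) {
    events.push_back("+" + name);   // Record construction
  }
  ~Noisy() { events.push_back("-" + name); }  // Record destruction
};

void inner() {
  Noisy c("c");
  throw 47;   // Leaves inner() and then outer() without returning
}

void outer() {
  Noisy a("a");
  Noisy b("b");
  inner();
}

// Runs outer() and returns the recorded events as one string.
std::string unwind_trace() {
  events.clear();
  try {
    outer();
  } catch (int) {
    events.push_back("caught");
  }
  std::string all;
  for (std::size_t i = 0; i < events.size(); ++i)
    all += events[i] + " ";
  return all;
}
```

The string returned by unwind_trace( ) is "+a +b +c -c -b -a caught ": every local object in both functions is destroyed, in reverse order of construction, before the handler body runs.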
There are two basic models in exception-handling theory: termination and resumption. In termination (which is what C++
supports), you assume the error is so critical that there’s no way to
automatically resume execution at the point where the exception occurred. In
other words, whoever threw the exception decided there was no way to salvage
the situation, and they don’t want to come back.
The alternative error-handling model is called resumption,
first introduced with the PL/I language in the 1960s. Using
resumption semantics means that the exception handler is expected to do
something to rectify the situation, and then the faulting code is automatically
retried, presuming success the second time. If you want resumption in C++, you
must explicitly transfer execution back to the code where the error occurred,
usually by repeating the function call that sent you there in the first place.
It is not unusual to place your try block inside a while loop
that keeps reentering the try block until the result is satisfactory.
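A retry loop of that kind might look like the following sketch. The exception type NotReady, the counter attempts_made, and both functions are hypothetical; a bound on the number of attempts is added here as an assumption so that the loop cannot spin forever.

```cpp
// Hypothetical exception type and flaky operation, for illustration only.
class NotReady {};

int attempts_made = 0;

// Fails the first two times it is called, then succeeds.
int fetch_value() {
  if (++attempts_made < 3) throw NotReady();
  return 99;
}

// Re-enters the try block until fetch_value() succeeds or the
// attempt budget runs out; returns -1 if it never succeeds.
int fetch_with_retry(int max_attempts) {
  int tries = 0;
  while (tries < max_attempts) {
    try {
      return fetch_value();
    } catch (const NotReady&) {
      ++tries;  // Any corrective action would go here before looping
    }
  }
  return -1;
}
```

With a budget of five attempts the call succeeds on the third try; with a budget of two it gives up and returns -1. Note that the retry starts again at the call to fetch_value( ), not at the point inside it where the exception was thrown.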
Historically, programmers using operating systems that
supported resumptive exception handling eventually ended up using
termination-like code and skipping resumption. Although resumption sounds
attractive at first, it seems it isn’t quite so useful in practice. One reason
may be the distance that can occur between the exception and its handler. It is
one thing to terminate to a handler that’s far away, but to jump to that
handler and then back again may be too conceptually difficult for large systems
where the exception is generated from many points.
When an exception is thrown, the exception-handling system
looks through the “nearest” handlers in the order they appear in the source
code. When it finds a match, the exception is considered handled and no further
searching occurs.
Matching an exception doesn’t require a perfect correlation
between the exception and its handler. An object or reference to a
derived-class object will match a handler for the base class. (However, if the
handler is for an object rather than a reference, the exception object is
“sliced”—truncated to the base type—as it is passed to the handler. This does no damage, but loses all the derived-type information.) For this reason, as
well as to avoid making yet another copy of the exception object, it is always better to catch an exception by reference instead of by value. If
a pointer is thrown, the usual standard pointer conversions are used to match
the exception. However, no automatic type conversions are used to convert from one exception type to another in the process of matching. For example:
//: C01:Autoexcp.cpp
// No matching conversions.
#include <iostream>
using namespace std;

class Except1 {};

class Except2 {
public:
  Except2(const Except1&) {}
};

void f() { throw Except1(); }

int main() {
  try { f();
  } catch(Except2&) {
    cout << "inside catch(Except2)" << endl;
  } catch(Except1&) {
    cout << "inside catch(Except1)" << endl;
  }
} ///:~
Even though you might think the first handler could be matched
by converting an Except1 object into an Except2 using the converting
constructor, the system will not perform such a conversion during exception
handling, and you’ll end up at the Except1 handler.
The following example shows how a base-class handler can
catch a derived-class exception:
//: C01:Basexcpt.cpp
// Exception hierarchies.
#include <iostream>
using namespace std;

class X {
public:
  class Trouble {};
  class Small : public Trouble {};
  class Big : public Trouble {};
  void f() { throw Big(); }
};

int main() {
  X x;
  try {
    x.f();
  } catch(X::Trouble&) {
    cout << "caught Trouble" << endl;
  // Hidden by previous handler:
  } catch(X::Small&) {
    cout << "caught Small Trouble" << endl;
  } catch(X::Big&) {
    cout << "caught Big Trouble" << endl;
  }
} ///:~
Here, the exception-handling mechanism will always match a Trouble
object, or anything that is a Trouble (through public
inheritance), to
the first handler. That means the second and third handlers are never called
because the first one captures them all. It makes more sense to catch the
derived types first and put the base type at the end to catch anything less
specific.
Notice that these examples catch exceptions by reference,
although for these classes it isn’t important because there are no additional
members in the derived classes, and there are no argument identifiers in the
handlers anyway. You’ll usually want to use reference arguments rather than
value arguments in your handlers to avoid slicing off information.
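The effect of handler order and of slicing is easier to see when the classes have a virtual function. The sketch below uses standalone, simplified classes named Trouble and Big (they are not the nested classes of X above, and the virtual name( ) function is an addition made for this illustration). Some compilers warn about the by-value handler, which is the behavior being demonstrated.

```cpp
#include <string>

class Trouble {
public:
  virtual ~Trouble() {}
  virtual std::string name() const { return "Trouble"; }
};

class Big : public Trouble {
public:
  std::string name() const { return "Big"; }
};

// Derived handler first, base handler last, both by reference.
std::string ordered_handlers(bool throw_big) {
  try {
    if (throw_big) throw Big();
    throw Trouble();
  } catch (const Big& b) {
    return "Big handler saw " + b.name();
  } catch (const Trouble& t) {
    return "Trouble handler saw " + t.name();
  }
}

// Base handler by value: a Big is sliced down to a Trouble.
std::string caught_by_value() {
  try {
    throw Big();
  } catch (Trouble t) {
    return t.name();
  }
}

// Base handler by reference: the dynamic type survives.
std::string caught_by_reference() {
  try {
    throw Big();
  } catch (const Trouble& t) {
    return t.name();
  }
}
```

Here caught_by_value( ) returns "Trouble" while caught_by_reference( ) returns "Big", even though both functions throw the same object.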
Sometimes you want to create a handler that catches any
type of exception. You do this using the ellipsis in the argument list:
catch(...) {
  cout << "an exception was thrown" << endl;
}
Because an ellipsis catches any exception, you’ll want to
put it at the end of your list of handlers to avoid pre-empting any that
follow it.
The ellipsis gives you no possibility to have an argument, so
you can’t know anything about the exception or its type. It’s a “catchall.”
Such a catch clause is often used to clean up some resources and then
rethrow the exception.
You usually want to rethrow an exception when you have some
resource that needs to be released, such as a network connection or heap memory
that needs to be deallocated. (See the section “Resource Management” later in
this chapter for more detail). If an exception occurs, you don’t necessarily
care what error caused the exception—you just want to close the connection you
opened previously. After that, you’ll want to let some other context closer to
the user (that is, higher up in the call chain) handle the exception. In this
case the ellipsis specification is just what you want. You want to catch any
exception, clean up your resource, and then rethrow the exception for handling
elsewhere. You rethrow an exception by using throw with no argument
inside a handler:
catch(...) {
  cout << "an exception was thrown" << endl;
  // Deallocate your resource here, and then rethrow
  throw;
}
Any further catch clauses for the same try
block are still ignored—the throw causes the exception to go to the
exception handlers in the next-higher context. In addition, everything about
the exception object is preserved, so the handler at the higher context that
catches the specific exception type can extract any information the object may
contain.
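A compact sketch of cleanup followed by a rethrow appears below. The flag connection_open stands in for a real resource, and the exception class LinkDropped and all function names are hypothetical.

```cpp
// Hypothetical stand-in for a resource such as a network connection.
bool connection_open = false;

// Hypothetical exception carrying a detail the top level will want.
class LinkDropped {
  int bytes;
public:
  explicit LinkDropped(int sent) : bytes(sent) {}
  int bytesSent() const { return bytes; }
};

void send_all() { throw LinkDropped(512); }

// Opens the "connection" and closes it again if anything goes wrong,
// then passes the same exception object up the call chain.
void transfer() {
  connection_open = true;
  try {
    send_all();
  } catch (...) {
    connection_open = false;  // Clean up without knowing the type
    throw;                    // Rethrow the current exception
  }
  connection_open = false;    // The normal path closes it too
}

// A higher context that knows what to do with the specific type.
int bytes_sent_before_failure() {
  try {
    transfer();
    return -1;                // No failure occurred
  } catch (const LinkDropped& e) {
    return e.bytesSent();
  }
}
```

The catch(...) clause in transfer( ) never learns the type of the exception, yet the handler in bytes_sent_before_failure( ) still receives a LinkDropped with its data intact.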
As we explained in the beginning of this chapter, exception
handling is considered better than the traditional return-an-error-code
technique because exceptions can’t be ignored, and because the error handling
logic is separated from the problem at hand. If none of the exception handlers following a particular try block matches an exception, that exception
moves to the next-higher context, that is, the function or try block
surrounding the try block that did not catch the exception. (The
location of this try block is not always obvious at first glance, since
it’s higher up in the call chain.) This process continues until, at some level,
a handler matches the exception. At that point, the exception is considered “caught,”
and no further searching occurs.
The terminate( ) function
If no handler at any level catches the exception, the
special library function terminate( ) (declared in the <exception>
header) is automatically called. By default, terminate( ) calls the
Standard C library function abort( ) , which abruptly exits the
program. On Unix systems, abort( ) also causes a core dump. When abort( )
is called, no calls to normal program termination functions occur, which means
that destructors for global and static objects do not execute. The terminate( )
function also executes if a destructor for a local object throws an exception while
the stack is unwinding (interrupting the exception that was in progress) or if
a global or static object’s constructor or destructor throws an exception. (In
general, do not allow a destructor to throw an exception.)
The set_terminate( ) function
You can install your own terminate( ) function
using the standard set_terminate( ) function, which returns a
pointer to the terminate( ) function you are replacing (which will
be the default library version the first time you call it), so you can restore
it later if you want. Your custom terminate( ) must take no
arguments and have a void return value. In addition, any terminate( )
handler you install must not return or throw an exception, but instead must
execute some sort of program-termination logic. If terminate( ) is
called, the problem is unrecoverable.
The following example shows the use of set_terminate( ).
Here, the return value is saved so that the previous handler can be restored
later, and the custom terminate( ) function can be used to help isolate the
section of code where the uncaught exception occurs:
//: C01:Terminator.cpp
// Use of set_terminate(). Also shows uncaught exceptions.
#include <cstdlib>  // For exit()
#include <exception>
#include <iostream>
using namespace std;

void terminator() {
  cout << "I'll be back!" << endl;
  exit(0);
}

void (*old_terminate)() = set_terminate(terminator);

class Botch {
public:
  class Fruit {};
  void f() {
    cout << "Botch::f()" << endl;
    throw Fruit();
  }
  ~Botch() { throw 'c'; }
};

int main() {
  try {
    Botch b;
    b.f();
  } catch(...) {
    cout << "inside catch(...)" << endl;
  }
} ///:~
The definition of old_terminate looks a bit confusing
at first: it not only creates a pointer to a function, but it initializes that
pointer to the return value of set_terminate( ). Even though you
might be familiar with seeing a semicolon right after a pointer-to-function
declaration, here it’s just another kind of variable and can be initialized
when it is defined.
The class Botch not only throws an exception inside f( ),
but also in its destructor. This causes a call to terminate( ), as
you can see in main( ). Even though the exception handler says catch(...),
which would seem to catch everything and leave no cause for terminate( )
to be called, terminate( ) is called anyway. In the process of
cleaning up the objects on the stack to handle one exception, the Botch
destructor is called, and that generates a second exception, forcing a call to terminate( ).
Thus, a destructor that throws an exception or causes one to be thrown is
usually a sign of poor design or sloppy coding.
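Restoring the previous handler once a suspect region of code has finished can be sketched as follows. The function names are hypothetical, and the sketch assumes the region itself returns normally; it only demonstrates that set_terminate( ) hands back whichever handler was installed before.

```cpp
#include <cstdlib>
#include <exception>
#include <iostream>

// Hypothetical replacement handler: report, then end the program.
// It must not return and must not throw.
void report_and_abort() {
  std::cerr << "unrecoverable error, shutting down" << std::endl;
  std::abort();
}

// Installs the handler around a region of code and restores the
// previous handler afterward. Returns true if the handler that was
// active during the region was the custom one.
bool with_custom_terminate(void (*region)()) {
  std::terminate_handler previous = std::set_terminate(report_and_abort);
  region();
  std::terminate_handler active = std::set_terminate(previous);
  return active == report_and_abort;
}

// A region that does nothing, so terminate() is never triggered here.
void harmless_region() {}
```

Calling with_custom_terminate(harmless_region) returns true, and afterward the handler that was in place before the call is active again.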
Part of the magic of exception handling is that you can pop from normal program flow into the appropriate exception handler. Doing so
wouldn’t be useful, however, if things weren’t cleaned up properly as the
exception was thrown. C++ exception handling guarantees that as you leave a
scope, all objects in that scope whose constructors have been completed
will have their destructors called.
Here’s an example that demonstrates that constructors that aren’t completed don’t have the associated destructors called. It also shows
what happens when an exception is thrown in the middle of the creation of an
array of objects:
//: C01:Cleanup.cpp
// Exceptions clean up complete objects only.
#include <iostream>
using namespace std;

class Trace {
  static int counter;
  int objid;
public:
  Trace() {
    objid = counter++;
    cout << "constructing Trace #" << objid << endl;
    if(objid == 3) throw 3;
  }
  ~Trace() {
    cout << "destructing Trace #" << objid << endl;
  }
};

int Trace::counter = 0;

int main() {
  try {
    Trace n1;
    // Throws exception:
    Trace array[5];
    Trace n2;  // Won't get here.
  } catch(int i) {
    cout << "caught " << i << endl;
  }
} ///:~
The class Trace keeps track of objects so that you
can trace program progress. It keeps a count of the number of objects created
with a static data member counter and tracks the number of the
particular object with objid.
The main program creates a single object, n1 (objid
0), and then attempts to create an array of five Trace objects, but an
exception is thrown before the fourth object (#3) is fully created. The object n2
is never created. You can see the results in the output of the program:
constructing Trace #0
constructing Trace #1
constructing Trace #2
constructing Trace #3
destructing Trace #2
destructing Trace #1
destructing Trace #0
caught 3
Two array elements are successfully created, but in the
middle of the constructor for the third element, an exception is thrown.
Because the fourth construction in main( ) (for array[2])
never completes, only the destructors for objects array[1] and array[0]
are called. Finally, object n1 is destroyed, but not object n2,
because it was never created.
When writing code with exceptions, it’s particularly
important that you always ask, “If an exception occurs, will my resources be
properly cleaned up?” Most of the time you’re fairly safe, but in constructors
there’s a particular problem: if an exception is thrown before a constructor is
completed, the associated destructor will not be called for that object. Thus,
you must be especially diligent while writing your constructor.
The difficulty is in allocating resources in constructors.
If an exception occurs in the constructor, the destructor doesn’t get a chance
to deallocate the resource. This problem occurs most often with “naked”
pointers. For example:
//: C01:Rawp.cpp
// Naked pointers.
#include <iostream>
#include <cstddef>
using namespace std;

class Cat {
public:
  Cat() { cout << "Cat()" << endl; }
  ~Cat() { cout << "~Cat()" << endl; }
};

class Dog {
public:
  void* operator new(size_t sz) {
    cout << "allocating a Dog" << endl;
    throw 47;
  }
  void operator delete(void* p) {
    cout << "deallocating a Dog" << endl;
    ::operator delete(p);
  }
};

class UseResources {
  Cat* bp;
  Dog* op;
public:
  UseResources(int count = 1) {
    cout << "UseResources()" << endl;
    bp = new Cat[count];
    op = new Dog;
  }
  ~UseResources() {
    cout << "~UseResources()" << endl;
    delete [] bp; // Array delete
    delete op;
  }
};

int main() {
  try {
    UseResources ur(3);
  } catch(int) {
    cout << "inside handler" << endl;
  }
} ///:~
The output is
UseResources()
Cat()
Cat()
Cat()
allocating a Dog
inside handler
The UseResources constructor is entered, and the Cat
constructor is successfully completed for the three array objects. However,
inside Dog::operator new( ), an exception is thrown (to simulate an
out-of-memory error). Suddenly, you end up inside the handler, without
the UseResources destructor being called. This is correct because the UseResources
constructor was unable to finish, but it also means the Cat objects that
were successfully created on the heap were never destroyed.
To prevent such resource leaks, you must guard against these
“raw” resource allocations in one of two ways:
· You can catch exceptions inside the constructor and then release
the resource.
· You can place the allocations inside an object’s constructor, and
you can place the deallocations inside an object’s destructor.
Using the latter approach, each allocation becomes atomic, by virtue of being part of the lifetime of a local object, and if it fails, the
other resource allocation objects are properly cleaned up during stack
unwinding. This technique is called Resource Acquisition Is Initialization (RAII for short) because it equates resource control with object lifetime.
Using templates is an excellent way to modify the previous example to achieve
this:
//: C01:Wrapped.cpp
// Safe, atomic pointers.
#include <iostream>
#include <cstddef>
using namespace std;

// Simplified. Yours may have other arguments.
template<class T, int sz = 1> class PWrap {
  T* ptr;
public:
  class RangeError {}; // Exception class
  PWrap() {
    ptr = new T[sz];
    cout << "PWrap constructor" << endl;
  }
  ~PWrap() {
    delete[] ptr;
    cout << "PWrap destructor" << endl;
  }
  // Note: dynamic exception specifications such as
  // throw(RangeError) were removed in C++17.
  T& operator[](int i) throw(RangeError) {
    if(i >= 0 && i < sz) return ptr[i];
    throw RangeError();
  }
};

class Cat {
public:
  Cat() { cout << "Cat()" << endl; }
  ~Cat() { cout << "~Cat()" << endl; }
  void g() {}
};

class Dog {
public:
  void* operator new[](size_t) {
    cout << "Allocating a Dog" << endl;
    throw 47;
  }
  void operator delete[](void* p) {
    cout << "Deallocating a Dog" << endl;
    ::operator delete[](p);
  }
};

class UseResources {
  PWrap<Cat, 3> cats;
  PWrap<Dog> dog;
public:
  UseResources() { cout << "UseResources()" << endl; }
  ~UseResources() { cout << "~UseResources()" << endl; }
  void f() { cats[1].g(); }
};

int main() {
  try {
    UseResources ur;
  } catch(int) {
    cout << "inside handler" << endl;
  } catch(...) {
    cout << "inside catch(...)" << endl;
  }
} ///:~
The difference is the use of the template to wrap the
pointers and make them into objects. The constructors for these objects are
called before the body of the UseResources constructor, and any
of these constructors that complete before an exception is thrown will have
their associated destructors called during stack unwinding.
The PWrap template shows a more typical use of
exceptions than you’ve seen so far: A nested class called RangeError is
created to use in operator[ ] if its argument is out of range.
Because operator[ ] returns a reference, it cannot return zero. (There are no null references.) This is a true exceptional condition—you don’t know
what to do in the current context and you can’t return an improbable value. In
this example, RangeError is
simple and assumes all the necessary information is in the class name, but you
might also want to add a member that contains the value of the index, if that
is useful.
Now the output is
Cat()
Cat()
Cat()
PWrap constructor
Allocating a Dog
~Cat()
~Cat()
~Cat()
PWrap destructor
inside handler
Again, the storage allocation for Dog throws an
exception, but this time the array of Cat objects is properly cleaned
up, so there is no memory leak.
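For comparison, the first option listed earlier (catching the exception inside the constructor and releasing the resource by hand) might look like the following sketch. It is a standalone simplification: the classes reuse the names Cat, Dog, and UseResources but are not the ones above, Dog fails in its constructor rather than in operator new, and the counter cats_alive is a hypothetical device for observing the cleanup.

```cpp
// Hypothetical counter so the cleanup can be observed.
int cats_alive = 0;

class Cat {
public:
  Cat() { ++cats_alive; }
  ~Cat() { --cats_alive; }
};

class Dog {
public:
  Dog() { throw 47; }  // Simulates a failure while acquiring the resource
};

class UseResources {
  Cat* bp;
  Dog* op;
public:
  explicit UseResources(int count = 1) : bp(0), op(0) {
    bp = new Cat[count];
    try {
      op = new Dog;
    } catch (...) {
      delete [] bp;    // Release what was already acquired...
      throw;           // ...then let the creator see the failure
    }
  }
  ~UseResources() {
    delete op;
    delete [] bp;
  }
};

// Returns how many Cat objects remain after a failed construction.
int cats_leaked() {
  try {
    UseResources ur(3);
  } catch (int) {
  }
  return cats_alive;
}
```

This works (cats_leaked( ) returns 0), but every additional raw resource needs its own try block and its own hand-written release, which is the bookkeeping that wrapper objects remove.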
Since dynamic memory is the most frequent resource used in a
typical C++ program, the standard provides an RAII wrapper for pointers to heap
memory that automatically frees the memory. The auto_ptr class template, defined in the <memory> header, has a constructor that takes a
pointer to its generic type (whatever you use in your code). The auto_ptr
class template also overloads the pointer operators * and ->
to forward these operations to the original pointer the auto_ptr object
is holding. So you can use the auto_ptr object as if it were a raw pointer.
Here’s how it works:
//: C01:Auto_ptr.cpp
// Illustrates the RAII nature of auto_ptr.
#include <memory>
#include <iostream>
#include <cstddef>
using namespace std;

class TraceHeap {
  int i;
public:
  static void* operator new(size_t siz) {
    void* p = ::operator new(siz);
    cout << "Allocating TraceHeap object on the heap "
         << "at address " << p << endl;
    return p;
  }
  static void operator delete(void* p) {
    cout << "Deleting TraceHeap object at address "
         << p << endl;
    ::operator delete(p);
  }
  TraceHeap(int i) : i(i) {}
  int getVal() const { return i; }
};

int main() {
  auto_ptr<TraceHeap> pMyObject(new TraceHeap(5));
  cout << pMyObject->getVal() << endl;  // Prints 5
} ///:~
The TraceHeap class overloads the operator new
and operator delete so you can see exactly what’s happening. Notice
that, like any other class template, you specify the type you’re going to use
in a template parameter. You don’t say TraceHeap*, however—auto_ptr
already knows that it will be storing a pointer to your type. The second line
of main( ) verifies that auto_ptr’s operator->( )
function applies the indirection to the original, underlying pointer. Most
important, even though we didn’t explicitly delete the original pointer, pMyObject’s
destructor deletes the original pointer when pMyObject goes out of scope at the
end of main( ), as the following output verifies:
Allocating TraceHeap object on the heap at address 8930040
5
Deleting TraceHeap object at address 8930040
The auto_ptr class template is also handy for pointer
data members. Since class objects contained by value are always destructed, auto_ptr
members always delete the raw pointer they wrap when the containing object is
destructed.
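The following sketch shows a smart pointer used as a data member. Because auto_ptr was deprecated in C++11 and removed in C++17, the sketch substitutes std::unique_ptr, which plays the same owning role here, and therefore assumes a C++11 or later compiler. The class Kennel and the counter cats_alive are hypothetical, and Cat and Dog are standalone simplifications of the earlier classes.

```cpp
#include <memory>

int cats_alive = 0;  // Hypothetical counter, for observation only

class Cat {
public:
  Cat() { ++cats_alive; }
  ~Cat() { --cats_alive; }
};

class Dog {
public:
  Dog() { throw 47; }  // Simulated failure during construction
};

class Kennel {
  std::unique_ptr<Cat> cat;  // Declared first, so initialized first
  std::unique_ptr<Dog> dog;
public:
  // If creating the Dog fails, the already-constructed member cat
  // is destroyed automatically, and it deletes its Cat.
  Kennel() : cat(new Cat), dog(new Dog) {}
};

// Returns how many Cat objects remain after a failed construction.
int cats_alive_after_failed_kennel() {
  try {
    Kennel k;
  } catch (int) {
  }
  return cats_alive;
}
```

Kennel needs no destructor and no try block, yet cats_alive_after_failed_kennel( ) returns 0: the member that finished construction cleans up after itself.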
Since constructors can routinely throw exceptions, you might
want to handle exceptions that occur when an object’s member or base subobjects
are initialized. To do this, you can place the initialization of such
subobjects in a function-level try block. In a departure from the usual
syntax, the try block for constructor initializers is the constructor
body, and the associated catch block follows the body of the
constructor, as in the following example:
//: C01:InitExcept.cpp {-bor}
// Handles exceptions from subobjects.
#include <iostream>
using namespace std;

class Base {
  int i;
public:
  class BaseExcept {};
  Base(int i) : i(i) { throw BaseExcept(); }
};

class Derived : public Base {
public:
  class DerivedExcept {
    const char* msg;
  public:
    DerivedExcept(const char* msg) : msg(msg) {}
    const char* what() const { return msg; }
  };
  Derived(int j) try : Base(j) {
    // Constructor body
    cout << "This won't print" << endl;
  } catch(BaseExcept&) {
    throw DerivedExcept("Base subobject threw");
  }
};

int main() {
  try {
    Derived d(3);
  } catch(Derived::DerivedExcept& d) {
    cout << d.what() << endl;  // "Base subobject threw"
  }
} ///:~
Notice that the initializer list in the constructor for Derived
goes after the try keyword but before the constructor body. If an
exception does occur, the contained object is not constructed, so it makes no
sense to return to the code that created it. For this reason, the only sensible
thing to do is to throw an exception in the function-level catch clause.
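The language enforces this: if the handler of a constructor's function-level try block reaches its end without throwing, the exception being handled is rethrown automatically. The sketch below shows that behavior with hypothetical classes Part and Widget and a hypothetical handler_note string that records what the handler did.

```cpp
#include <string>

class Part {
public:
  class PartError {};
  explicit Part(bool fail) { if (fail) throw PartError(); }
};

// Hypothetical record of what the handler did.
std::string handler_note;

class Widget {
  Part part;
public:
  // The handler only makes a note and throws nothing itself. Because
  // the object cannot be completed, the original exception is
  // rethrown automatically when the handler ends.
  explicit Widget(bool fail) try : part(fail) {
  } catch (Part::PartError&) {
    handler_note = "member failed";
  }
};

std::string build(bool fail) {
  handler_note.clear();
  try {
    Widget w(fail);
    return "built";
  } catch (Part::PartError&) {
    return "creator saw it: " + handler_note;
  }
}
```

Calling build(true) returns "creator saw it: member failed", so the handler ran and the creator still received the PartError.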
Although it is not terribly useful, C++ also allows
function-level try blocks for any function, as the following
example illustrates:
//: C01:FunctionTryBlock.cpp {-bor}
// Function-level try blocks.
// {RunByHand} (Don’t run automatically by the makefile)
#include <iostream>
using namespace std;

int main() try {
  throw "main";
} catch(const char* msg) {
  cout << msg << endl;
  return 1;
} ///:~
In this case, the catch block can return in the same
manner that the function body normally returns. Using this type of
function-level try block isn’t much different from inserting a try-catch
around the code inside of the function body.
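The same form works for a function other than main( ), and the handler can use the function's parameters to supply a return value. The names in this sketch (DivideByZero, checked_divide, divide_or_default) are hypothetical.

```cpp
class DivideByZero {};  // Hypothetical exception type

int checked_divide(int a, int b) {
  if (b == 0) throw DivideByZero();
  return a / b;
}

// The whole function body is the try block. The handler supplies the
// return value when the body cannot; the parameters remain in scope.
int divide_or_default(int a, int b, int fallback) try {
  return checked_divide(a, b);
} catch (const DivideByZero&) {
  return fallback;
}
```

A call such as divide_or_default(6, 0, -1) returns -1, exactly as if the body had been wrapped in an ordinary try block inside the braces.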
The exceptions used with the Standard C++ library are also available for your use. Generally it’s easier and faster to start with a
standard exception class than to try to define your own. If the standard class
doesn’t do exactly what you need, you can derive from it.
All standard exception classes derive ultimately from the class exception, defined in the header <exception>. The two main
derived classes are logic_error and runtime_error, which are found in <stdexcept> (which itself includes <exception>). The class logic_error represents errors in programming logic, such as passing an
invalid argument. Runtime errors are those that occur as the result of
unforeseen forces such as hardware failure or memory exhaustion. Both runtime_error
and logic_error provide a constructor that takes a std::string
argument so that you can store a message in the exception object and extract it
later with exception::what( ) , as the following program illustrates:
//: C01:StdExcept.cpp
// Derives an exception class from std::runtime_error.
#include <stdexcept>
#include <iostream>
using namespace std;
class MyError : public runtime_error {
public:
MyError(const string& msg = "") :
runtime_error(msg) {}
};
int main() {
try {
throw MyError("my message");
} catch(MyError& x) {
cout << x.what() << endl;
}
} ///:~
Although the runtime_error constructor inserts the
message into its std::exception subobject, std::exception does
not provide a constructor that takes a std::string argument. You’ll usually
want to derive your exception classes from either runtime_error or logic_error
(or one of their derivatives), and not from std::exception.
The following tables describe the standard exception
classes:
| exception | The base class for all the exceptions thrown by the C++ Standard library. You can ask what( ) and retrieve the optional string with which the exception was initialized. |
| logic_error | Derived from exception. Reports program logic errors, which could presumably be detected by inspection. |
| runtime_error | Derived from exception. Reports runtime errors, which can presumably be detected only when the program executes. |
The iostream exception class ios::failure is also
derived from exception, but it has no further subclasses.
You can use the classes in the following tables as
they are, or you can use them as base classes from which to derive your own
more specific types of exceptions.
| Exception classes derived from logic_error |
| domain_error | Reports violations of a precondition. |
| invalid_argument | Indicates an invalid argument to the function from which it is thrown. |
| length_error | Indicates an attempt to produce an object whose length is greater than or equal to npos (the largest representable value of the context’s size type, usually std::size_t). |
| out_of_range | Reports an out-of-range argument. |
| Exception classes derived from runtime_error |
| range_error | Reports violation of a postcondition. |
| overflow_error | Reports an arithmetic overflow. |
| Exception classes derived directly from exception |
| bad_cast | Thrown for executing an invalid dynamic_cast expression in runtime type identification (see Chapter 8). Defined in <typeinfo>. |
| bad_typeid | Reports a null pointer p in an expression typeid(*p). (Again, a runtime type identification feature in Chapter 8.) Defined in <typeinfo>. |
| bad_alloc | Reports a failure to allocate storage. Defined in <new>. |
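To see how the hierarchy affects handler selection, here is a minimal sketch. The function classify( ) is a hypothetical helper invented for this illustration; it triggers one failure and reports which handler caught it. It assumes a compiler that provides std::stoi( ) (C++11 or later).

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: runs one failing operation and
// names the handler that caught the exception.
std::string classify(int which) {
  try {
    std::vector<int> v(3);
    if(which == 0) v.at(10) = 1;              // at() throws out_of_range
    if(which == 1) std::stoi("not a number"); // stoi() throws invalid_argument
    if(which == 2) throw std::overflow_error("too big"); // thrown by hand
    return "no exception";
  } catch(const std::out_of_range&) {   // most specific handler first
    return "out_of_range";
  } catch(const std::logic_error&) {    // catches invalid_argument
    return "logic_error";
  } catch(const std::runtime_error&) {  // catches overflow_error
    return "runtime_error";
  } catch(const std::exception&) {      // anything else from the library
    return "exception";
  }
}

// Usage sketch; call demo() from main().
void demo() {
  assert(classify(0) == "out_of_range");
  assert(classify(1) == "logic_error");
  assert(classify(2) == "runtime_error");
}
```

Handlers are tried in order, so the more derived types must appear before their base classes.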
You’re not required to inform the people using your function
what exceptions you might throw. However, failure to do so can be considered
uncivilized because it means that users cannot be sure what code to write to
catch all potential exceptions. If they have your source code, they can hunt
through and look for throw statements, but often a library doesn’t come
with sources. Good documentation can help alleviate this problem, but how many
software projects are well documented? C++ provides syntax to tell the user the
exceptions that are thrown by this function, so the user can handle them. This
is the optional exception specification, which adorns a function’s
declaration, appearing after the argument list.
The exception specification reuses the keyword throw,
followed by a parenthesized list of all the types of potential exceptions that
the function can throw. Your function declaration might look like this:
void f() throw(toobig, toosmall, divzero);
As far as exceptions are concerned, the traditional function
declaration
void f();
means that any type of exception can be thrown from
the function. If you say
void f() throw();
no exceptions whatsoever will be thrown from the
function (so you’d better be sure that no functions farther down in the call
chain let any exceptions propagate up!).
For good coding policy, good documentation, and ease-of-use
for the function caller, consider using exception specifications when you write
functions that throw exceptions. (Variations on this guideline are discussed
later in this chapter.)
The unexpected( ) function
If your exception specification claims you’re going to throw a certain set of exceptions and then you throw something that isn’t in that set,
what’s the penalty? The special function unexpected( ) is called
when you throw something other than what appears in the exception
specification. Should this unfortunate situation occur, the default unexpected( )
calls the terminate( ) function described earlier in this
chapter.
The set_unexpected( ) function
Like terminate( ), the unexpected( ) mechanism lets you install your own function to respond to unexpected exceptions. You do so
with a function called set_unexpected( ), which, like set_terminate( ),
takes the address of a function with no arguments and void return value.
Also, because it returns the previous value of the unexpected( )
pointer, you can save it and restore it later. To use set_unexpected( ),
include the header file <exception>. Here’s an example that shows a
simple use of the features discussed so far in this section:
//: C01:Unexpected.cpp
// Exception specifications & unexpected(),
//{-msc} (Doesn’t terminate properly)
#include <cstdlib> // For exit()
#include <exception>
#include <iostream>
using namespace std;
class Up {};
class Fit {};
void g();
void f(int i) throw(Up, Fit) {
switch(i) {
case 1: throw Up();
case 2: throw Fit();
}
g();
}
// void g() {} // Version 1
void g() { throw 47; } // Version 2
void my_unexpected() {
cout << "unexpected exception thrown"
<< endl;
exit(0);
}
int main() {
set_unexpected(my_unexpected); // (Ignores return value)
for(int i = 1; i <=3; i++)
try {
f(i);
} catch(Up) {
cout << "Up caught" << endl;
} catch(Fit) {
cout << "Fit caught" << endl;
}
} ///:~
The classes Up and Fit are created solely to
throw as exceptions. Often exception classes will be small, but they can
certainly hold additional information so that the handlers can query for it.
The f( ) function promises in its exception specification
to throw only exceptions of type Up and Fit, and from looking at
the function definition, this seems plausible. Version one of g( ),
called by f( ), doesn’t throw any exceptions, so this is true. But
if someone changes g( ) so that it throws a different type of
exception (like the second version in this example, which throws an int),
the exception specification for f( ) is violated.
The my_unexpected( ) function has no arguments
or return value, following the proper form for a custom unexpected( )
function. It simply displays a message so that you can see that it was called,
and then exits the program (exit(0) is used here so that the book’s make
process is not aborted). Your new unexpected( ) function should not
have a return statement.
In main( ), the try block is within a for
loop, so all the possibilities are exercised. In this way, you can achieve
something like resumption. Nest the try block inside a for, while,
do, or if, and have the handlers attempt to repair the
problem; then attempt the try block again.
Only the Up and Fit exceptions are caught
because those are the only exceptions that the programmer of f( )
said would be thrown. Version two of g( ) causes my_unexpected( )
to be called because f( ) then throws an int.
In the call to set_unexpected( ), the return
value is ignored, but it can also be saved in a pointer to function and be
restored later, as we did in the set_terminate( ) example earlier
in this chapter.
A typical unexpected handler logs the error and
terminates the program by calling exit( ). It can, however, throw
another exception (or rethrow the same exception) or call abort( ).
If it throws an exception of a type allowed by the function whose specification
was originally violated, the search resumes at the call of the function
with this exception specification. (This behavior is unique to unexpected( ).)
If the exception thrown from your unexpected handler
is not allowed by the original function’s specification, one of the following
occurs:
1. If std::bad_exception (defined in <exception>) was in the function’s exception specification, the exception thrown from the unexpected handler is replaced with a std::bad_exception object, and the search resumes from the function as before.
2. If the original function’s specification did not include std::bad_exception, terminate( ) is called.
The following program illustrates this behavior:
//: C01:BadException.cpp {-bor}
#include <exception> // For std::bad_exception
#include <iostream>
#include <cstdlib> // For exit()
using namespace std;
// Exception classes:
class A {};
class B {};
// terminate() handler
void my_thandler() {
cout << "terminate called" << endl;
exit(0);
}
// unexpected() handlers
void my_uhandler1() { throw A(); }
void my_uhandler2() { throw; }
// If we embed this throw statement in f or g,
// the compiler detects the violation and reports
// an error, so we put it in its own function.
void t() { throw B(); }
void f() throw(A) { t(); }
void g() throw(A, bad_exception) { t(); }
int main() {
set_terminate(my_thandler);
set_unexpected(my_uhandler1);
try {
f();
} catch(A&) {
cout << "caught an A from f" << endl;
}
set_unexpected(my_uhandler2);
try {
g();
} catch(bad_exception&) {
cout << "caught a bad_exception from g" << endl;
}
try {
f();
} catch(...) {
cout << "This will never print" << endl;
}
} ///:~
The my_uhandler1( ) handler throws an acceptable
exception (A), so execution resumes at the first catch, which succeeds.
The my_uhandler2( ) handler does not throw a valid exception (B),
but since g specifies bad_exception, the B exception is
replaced by a bad_exception object, and the second catch also succeeds.
Since f does not include bad_exception in its specification, my_thandler( )
is called as a terminate handler. Here’s the output:
caught an A from f
caught a bad_exception from g
terminate called
You may feel that the existing exception specification rules
aren’t very safe, and that
void f();
should mean that no exceptions are thrown from this
function. If the programmer wants to throw any type of exception, you might
think he or she should have to say
void f() throw(...); // Not in C++
This would surely be an improvement because function
declarations would be more explicit. Unfortunately, you can’t always know by
looking at the code in a function whether an exception will be thrown—it could
happen because of a memory allocation, for example. Worse, existing functions
written before exception handling was introduced into the language may find
themselves inadvertently throwing exceptions because of the functions they call
(which might be linked into new, exception-throwing versions). Hence, the
uninformative situation whereby
void f();
means, “Maybe I’ll throw an exception, maybe I won’t.” This
ambiguity is necessary to avoid hindering code evolution. If you want to
specify that f throws no exceptions, use the empty list, as in:
void f() throw();
Each public function in a class essentially forms a contract with the user; if you pass it certain arguments, it will perform
certain operations and/or return a result. The same contract must hold true in
derived classes; otherwise the expected “is-a” relationship between derived and
base classes is violated. Since exception specifications are logically part of
a function’s declaration, they too must remain consistent across an inheritance
hierarchy. For example, if a member function in a base class says it will only
throw an exception of type A, an override of that function in a derived
class must not add any other exception types to the specification list because
that would break any programs that adhere to the base class interface. You can,
however, specify fewer exceptions or none at all, since that
doesn’t require the user to do anything differently. You can also specify
anything that “is-a” A in place of A in the derived function’s
specification. Here’s an example.
//: C01:Covariance.cpp {-xo}
// Should cause compile error. {-mwcc}{-msc}
#include <iostream>
using namespace std;
class Base {
public:
class BaseException {};
class DerivedException : public BaseException {};
virtual void f() throw(DerivedException) {
throw DerivedException();
}
virtual void g() throw(BaseException) {
throw BaseException();
}
};
class Derived : public Base {
public:
void f() throw(BaseException) {
throw BaseException();
}
virtual void g() throw(DerivedException) {
throw DerivedException();
}
}; ///:~
A compiler should flag the override of Derived::f( )
with an error (or at least a warning) since it changes its exception
specification in a way that violates the specification of Base::f( ).
The specification for Derived::g( ) is acceptable because DerivedException
“is-a” BaseException (not the other way around). You can think of Base/Derived
and BaseException/DerivedException as parallel class hierarchies; when
you are in Derived, you can replace references to BaseException
in exception specifications and return values with DerivedException.
This behavior is called covariance (since both sets of classes vary down their respective hierarchies together). (Reminder from Volume 1: parameter types are not
covariant—you are not allowed to change the signature of an overridden virtual
function.)
If you peruse the function declarations throughout the Standard C++ library, you’ll find that not a single exception specification occurs
anywhere! Although this might seem strange, there is a good reason for this
seeming incongruity: the library consists mainly of templates, and you never
know what a generic type or function might do. For example, suppose you are
developing a generic stack template and attempt to affix an exception
specification to your pop function, like this:
T pop() throw(logic_error);
Since the only error you anticipate is a stack underflow,
you might think it’s safe to specify a logic_error or some other
appropriate exception type. But type T’s copy constructor could throw an
exception. Then unexpected( ) would be called, and your program
would terminate. You can’t make unsupportable guarantees. If you don’t know
what exceptions might occur, don’t use exception specifications. That’s why
template classes, which constitute the majority of the Standard C++ library, do
not use exception specifications—they specify the exceptions they know about in
documentation and leave the rest to you. Exception specifications are
mainly for non-template classes.
In Chapter 7 we’ll take an in-depth look at the containers
in the Standard C++ library, including the stack container. One thing
you’ll notice is that the declaration of the pop( ) member function
looks like this:
void pop();
You might think it strange that pop( ) doesn’t
return a value. Instead, it just removes the element at the top of the stack.
To retrieve the top value, call top( ) before you call pop( ).
There is an important reason for this behavior, and it has to do with exception
safety, a crucial consideration in library design. There are different
levels of exception safety, but most importantly, and just as the name implies,
exception safety is about correct semantics in the face of exceptions.
Suppose you are implementing a stack with a dynamic array
(we’ll call it data and the counter integer count), and you try
to write pop( ) so that it returns a value. The code for such a pop( )
might look something like this:
template<class T> T stack<T>::pop() {
if(count == 0)
throw logic_error("stack underflow");
else
return data[--count];
}
What happens if the copy constructor that is called for the
return value in the last line throws an exception when the value is returned?
The popped element is not returned because of the exception, and yet count
has already been decremented, so the top element you wanted is lost forever!
The problem is that this function attempts to do two things at once: (1) return
a value, and (2) change the state of the stack. It is better to separate these
two actions into two separate member functions, which is exactly what the
standard stack class does. (In other words, follow the design practice
of cohesion—every function should do one thing well.) Exception-safe
code leaves objects in a consistent state and does not leak resources.
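The separation of concerns described above can be sketched as follows. SafeStack and Fragile are hypothetical names invented for this illustration, and std::vector stands in for the dynamic array. Fragile is an element type whose copy constructor can be told to fail, so you can see that a failed copy in the caller loses nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type whose copy constructor can fail on demand.
struct Fragile {
  int value;
  static bool failCopies;
  explicit Fragile(int v) : value(v) {}
  Fragile(const Fragile& other) : value(other.value) {
    if(failCopies) throw std::runtime_error("copy failed");
  }
};
bool Fragile::failCopies = false;

// Hypothetical stack that separates inspection from removal.
template<class T> class SafeStack {
  std::vector<T> data;
public:
  void push(const T& x) { data.push_back(x); }
  // Returns a reference, so no copy is made inside the stack.
  const T& top() const {
    if(data.empty()) throw std::logic_error("stack underflow");
    return data.back();
  }
  // Only changes state; there is no return value to copy.
  void pop() {
    if(data.empty()) throw std::logic_error("stack underflow");
    data.pop_back();
  }
  std::size_t size() const { return data.size(); }
};

// Usage sketch; call demo() from main().
void demo() {
  SafeStack<Fragile> s;
  s.push(Fragile(1));
  s.push(Fragile(2));
  Fragile::failCopies = true;
  bool threw = false;
  try {
    Fragile copy = s.top(); // The copy happens in the caller and throws
    (void)copy;
    s.pop();                // Never reached
  } catch(const std::runtime_error&) {
    threw = true;
  }
  Fragile::failCopies = false;
  assert(threw);
  assert(s.size() == 2);      // Nothing was lost
  assert(s.top().value == 2); // The top element is still there
}
```

Because the copy is made before pop( ) is called, a failure leaves the stack exactly as it was.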
You also need to be careful writing custom assignment
operators. In Chapter 12 of Volume 1, you saw that operator= should
adhere to the following pattern:
1. Make sure you’re not assigning to self. If you are, go to step 6. (This is strictly an optimization.)
2. Allocate new memory required by pointer data members.
3. Copy data from the old memory to the new.
4. Delete the old memory.
5. Update the object’s state by assigning the new heap pointers to the pointer data members.
6. Return *this.
It’s important to not change the state of your object until
all the new pieces have been safely allocated and initialized. A good technique
is to move steps 2 and 3 into a separate function, often called clone( ).
The following example does this for a class that has two pointer members, theString
and theInts:
//: C01:SafeAssign.cpp
// An Exception-safe operator=.
#include <iostream>
#include <new> // For std::bad_alloc
#include <cstring>
#include <cstddef>
using namespace std;
// A class that has two pointer members using the heap
class HasPointers {
// A Handle class to hold the data
struct MyData {
const char* theString;
const int* theInts;
size_t numInts;
MyData(const char* pString, const int* pInts,
size_t nInts)
: theString(pString), theInts(pInts),
numInts(nInts) {}
} *theData; // The handle
// Clone and cleanup functions:
static MyData* clone(const char* otherString,
const int* otherInts, size_t nInts) {
char* newChars = new char[strlen(otherString)+1];
int* newInts;
try {
newInts = new int[nInts];
} catch(bad_alloc&) {
delete [] newChars;
throw;
}
try {
// This example uses built-in types, so it won't
// throw, but for class types it could throw, so we
// use a try block for illustration. (This is the
// point of the example!)
strcpy(newChars, otherString);
for(size_t i = 0; i < nInts; ++i)
newInts[i] = otherInts[i];
} catch(...) {
delete [] newInts;
delete [] newChars;
throw;
}
return new MyData(newChars, newInts, nInts);
}
static MyData* clone(const MyData* otherData) {
return clone(otherData->theString, otherData->theInts,
otherData->numInts);
}
static void cleanup(const MyData* theData) {
delete [] theData->theString;
delete [] theData->theInts;
delete theData;
}
public:
HasPointers(const char* someString, const int*
someInts,
size_t numInts) {
theData = clone(someString, someInts, numInts);
}
HasPointers(const HasPointers& source) {
theData = clone(source.theData);
}
HasPointers& operator=(const HasPointers&
rhs) {
if(this != &rhs) {
MyData* newData = clone(rhs.theData->theString,
rhs.theData->theInts,
rhs.theData->numInts);
cleanup(theData);
theData = newData;
}
return *this;
}
~HasPointers() { cleanup(theData); }
friend ostream&
operator<<(ostream& os, const HasPointers& obj) {
os << obj.theData->theString << ": ";
for(size_t i = 0; i < obj.theData->numInts; ++i)
os << obj.theData->theInts[i] << ' ';
return os;
}
};
int main() {
int someNums[] = { 1, 2, 3, 4 };
size_t someCount = sizeof someNums / sizeof
someNums[0];
int someMoreNums[] = { 5, 6, 7 };
size_t someMoreCount =
sizeof someMoreNums / sizeof someMoreNums[0];
HasPointers h1("Hello", someNums,
someCount);
HasPointers h2("Goodbye", someMoreNums,
someMoreCount);
cout << h1 << endl; // Hello: 1 2 3 4
h1 = h2;
cout << h1 << endl; // Goodbye: 5 6 7
} ///:~
For convenience, HasPointers uses the MyData
class as a handle to the two pointers. Whenever it’s time to allocate more
memory, whether during construction or assignment, the first clone
function is ultimately called to do the job. If the first call
to the new operator fails, a bad_alloc exception is thrown
automatically. If the failure happens on the second allocation (for theInts), we must
clean up the memory for theString—hence the first try block that
catches a bad_alloc exception. The second try block isn’t crucial
here because we’re just copying ints and pointers (so no exceptions will
occur), but whenever you copy objects, their assignment operators can possibly
cause an exception, so everything needs to be cleaned up. In both exception
handlers, notice that we rethrow the exception. That’s because we’re just managing resources here; the user still needs to know that something
went wrong, so we let the exception propagate up the dynamic chain. Software
libraries that don’t silently swallow exceptions are called exception
neutral. Always strive to write libraries that are both exception safe and
exception neutral.
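To see what this ordering buys you, here is a reduced sketch with a single pointer member. IntBuffer and its failNextClone flag are hypothetical; the flag is a test hook invented for this illustration that simulates memory exhaustion. When the clone step fails, the exception propagates to the caller and the target of the assignment is untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with one heap-allocated array.
class IntBuffer {
  int* data;
  std::size_t count;
  // Steps 2 and 3: allocate and copy without touching any object.
  static int* clone(const int* src, std::size_t n) {
    if(failNextClone) {
      failNextClone = false;
      throw std::bad_alloc(); // Simulated allocation failure
    }
    int* p = new int[n];
    for(std::size_t i = 0; i < n; ++i)
      p[i] = src[i];
    return p;
  }
public:
  static bool failNextClone; // Hypothetical test hook
  IntBuffer(const int* src, std::size_t n)
  : data(clone(src, n)), count(n) {}
  IntBuffer(const IntBuffer& rhs)
  : data(clone(rhs.data, rhs.count)), count(rhs.count) {}
  IntBuffer& operator=(const IntBuffer& rhs) {
    if(this != &rhs) {
      int* fresh = clone(rhs.data, rhs.count); // May throw; *this untouched
      delete [] data;                          // Cannot throw
      data = fresh;
      count = rhs.count;
    }
    return *this;
  }
  ~IntBuffer() { delete [] data; }
  std::size_t size() const { return count; }
  int at(std::size_t i) const { return data[i]; }
};
bool IntBuffer::failNextClone = false;

// Usage sketch; call demo() from main().
void demo() {
  int one[] = { 1, 2, 3 };
  int two[] = { 9, 8 };
  IntBuffer a(one, 3), b(two, 2);
  IntBuffer::failNextClone = true;
  bool threw = false;
  try {
    a = b; // The clone step fails
  } catch(const std::bad_alloc&) {
    threw = true; // Exception neutral: the caller is told
  }
  assert(threw);
  assert(a.size() == 3 && a.at(0) == 1); // Old state intact
  a = b;                                 // Succeeds this time
  assert(a.size() == 2 && a.at(0) == 9);
}
```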
If you inspect the previous code closely, you’ll notice that
none of the delete operations will throw an exception. This code depends
on that fact. Recall that when you call delete on an object, the
object’s destructor is called. It turns out to be practically impossible to
design exception-safe code without assuming that destructors don’t throw
exceptions. Don’t let destructors throw exceptions. (We’re going to remind you
about this once more before this chapter is done).
For most programmers, especially C programmers, exceptions
are not available in their existing language and require some adjustment. Here
are guidelines for programming with exceptions.
Exceptions aren’t the answer to all problems; overuse can
cause trouble. The following sections point out situations where exceptions are
not warranted. The best advice for deciding when to use exceptions is to
throw exceptions only when a function fails to meet its specification.
Not for asynchronous events
The Standard C signal( ) system and any similar system handle asynchronous events: events that
happen outside the flow of a program, and thus events the program cannot
anticipate. You cannot use C++ exceptions to handle asynchronous events because
the exception and its handler are on the same call stack. That is, exceptions
rely on the dynamic chain of function calls on the program’s runtime stack (they
have “dynamic scope”), whereas asynchronous events must be handled by
completely separate code that is not part of the normal program flow
(typically, interrupt service routines or event loops). Don’t throw exceptions
from interrupt handlers.
This is not to say that asynchronous events cannot be associated
with exceptions. But the interrupt handler should do its job as quickly as
possible and then return. The typical way to handle this situation is to set a
flag in the interrupt handler, and check it synchronously in the mainline code.
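A minimal sketch of that flag technique using the Standard C signal( ) facility follows. The names interrupted, onInterrupt( ), and checkForInterrupt( ) are hypothetical, and raise( ) is used here only to simulate an event arriving from outside.

```cpp
#include <cassert>
#include <csignal>
#include <stdexcept>

// Set by the handler, read by the mainline code.
volatile std::sig_atomic_t interrupted = 0;

// The handler only records the event and returns.
extern "C" void onInterrupt(int) { interrupted = 1; }

// The mainline code calls this at a point of its own choosing,
// where throwing an exception is safe.
void checkForInterrupt() {
  if(interrupted) {
    interrupted = 0;
    throw std::runtime_error("interrupted");
  }
}

// Usage sketch; returns true if the event surfaced as an exception.
bool demo() {
  std::signal(SIGINT, onInterrupt);
  std::raise(SIGINT); // Stands in for an asynchronous event
  bool caught = false;
  try {
    checkForInterrupt();
  } catch(const std::runtime_error&) {
    caught = true;
  }
  std::signal(SIGINT, SIG_DFL); // Restore default handling
  return caught;
}
```

The exception is thrown from ordinary code on the ordinary call stack, never from the handler itself.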
Not for benign error conditions
If you have enough information to handle an error, it’s not
an exception. Take care of it in the current context rather than throwing an
exception to a larger context.
Also, C++ exceptions are not thrown for machine-level events
such as divide-by-zero. It’s
assumed that some other mechanism, such as the operating system or hardware,
deals with these events. In this way, C++ exceptions can be reasonably
efficient, and their use is isolated to program-level exceptional conditions.
Not for flow-of-control
An exception looks somewhat like an alternate return
mechanism and somewhat like a switch statement, so you might be tempted
to use an exception instead of these ordinary language mechanisms. This is a
bad idea, partly because the exception-handling system is significantly less
efficient than normal program execution. Exceptions are a rare event, so the
normal program shouldn’t pay for them. Also, exceptions from anything other
than error conditions are quite confusing to the user of your class or
function.
You’re not forced to use exceptions
Some programs are quite simple (small utilities, for
example). You might only need to take input and perform some processing. In
these programs, you might attempt to allocate memory and fail, try to open a
file and fail, and so on. It is acceptable in these programs to display a
message and exit the program, allowing the system to clean up the mess, rather
than to work hard to catch all exceptions and recover all the resources
yourself. Basically, if you don’t need exceptions, you’re not forced to use
them.
New exceptions, old code
Another situation that arises is the modification of an
existing program that doesn’t use exceptions. You might introduce a library
that does use exceptions and wonder if you need to modify all your code
throughout the program. Assuming you have an acceptable error-handling scheme
already in place, the most straightforward thing to do is surround the largest
block that uses the new library (this might be all the code in main( ))
with a try block, followed by a catch(...) and a basic error
message. You can refine this to whatever degree necessary by adding more
specific handlers, but, in any case, the code you must add can be minimal. It’s
even better to isolate your exception-generating code in a try block and
write handlers to convert the exceptions into your existing error-handling
scheme.
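As a sketch of that last approach, suppose the existing program reports errors with status codes. The Status enumeration and both functions below are hypothetical; newLibraryLookup( ) stands in for a routine from the newly introduced library.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical codes standing in for the existing error scheme.
enum Status { STATUS_OK, STATUS_BAD_INPUT, STATUS_NO_MEMORY,
  STATUS_UNKNOWN };

// Stand-in for a new library routine that throws.
int newLibraryLookup(const std::vector<int>& table,
  std::size_t index) {
  return table.at(index); // Throws std::out_of_range on a bad index
}

// Boundary function: old code calls this and sees only Status values.
Status lookup(const std::vector<int>& table, std::size_t index,
  int& result) {
  try {
    result = newLibraryLookup(table, index);
    return STATUS_OK;
  } catch(const std::out_of_range&) {
    return STATUS_BAD_INPUT;
  } catch(const std::bad_alloc&) {
    return STATUS_NO_MEMORY;
  } catch(...) {
    return STATUS_UNKNOWN; // Nothing escapes into the old code
  }
}

// Usage sketch; call demo() from main().
void demo() {
  std::vector<int> table(2, 7);
  int value = 0;
  assert(lookup(table, 1, value) == STATUS_OK && value == 7);
  assert(lookup(table, 5, value) == STATUS_BAD_INPUT);
}
```

Only the boundary function knows about exceptions; the rest of the program keeps testing return codes as before.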
It’s truly important to think about exceptions when you’re
creating a library for someone else to use, especially if you can’t know how
they need to respond to critical error conditions (recall the earlier
discussions on exception safety and why there are no exception specifications
in the Standard C++ Library).
Do use exceptions to do the following:
· Fix the problem and retry the function that caused the exception.
· Patch things up and continue without retrying the function.
· Do whatever you can in the current context and rethrow the same
exception to a higher context.
· Do whatever you can in the current context and throw a different
exception to a higher context.
· Terminate the program.
· Wrap functions (especially C library functions) that use ordinary
error schemes so they produce exceptions instead.
· Simplify. If your error handling scheme makes things more
complicated, it is painful and annoying to use. Exceptions can be used to make
error handling simpler and more effective.
· Make your library and program safer. This is a short-term
investment (for debugging) and a long-term investment (for application
robustness).
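The item about wrapping C library functions can be sketched with strtol( ), which reports failure through errno and an end pointer. The wrapper name parseLong( ) is hypothetical, and the choice of invalid_argument and out_of_range is one reasonable mapping, not the only one.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: converts strtol()'s error reporting
// into standard exceptions.
long parseLong(const std::string& text) {
  errno = 0;
  char* end = 0;
  long value = std::strtol(text.c_str(), &end, 10);
  // No digits consumed, or trailing characters left over
  if(end == text.c_str() || *end != '\0')
    throw std::invalid_argument("not a number: " + text);
  // strtol() signals overflow by setting errno
  if(errno == ERANGE)
    throw std::out_of_range("out of range: " + text);
  return value;
}

// Usage sketch; call demo() from main().
void demo() {
  assert(parseLong("42") == 42);
  bool rejected = false;
  try {
    parseLong("12x");
  } catch(const std::invalid_argument&) {
    rejected = true;
  }
  assert(rejected);
}
```

The caller can no longer forget to check errno; a bad conversion cannot be silently ignored.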
When to use exception specifications
The exception specification is like a function prototype: it
tells the user to write exception-handling code and what exceptions to handle.
It tells the compiler the exceptions that might come out of this function so
that it can detect violations at runtime.
You can’t always look at the code and anticipate which
exceptions will arise from a particular function. Sometimes, the functions it
calls produce an unexpected exception, and sometimes an old function that
didn’t throw an exception is replaced with a new one that does, and you get a
call to unexpected( ). Any time you use exception specifications or
call functions that do, consider creating your own unexpected( )
function that logs a message and then either throws an exception or aborts the
program.
As we explained earlier, you should avoid using exception
specifications in template classes, since you can’t anticipate what types of
exceptions the template parameter classes might throw.
Start with standard exceptions
Check out the Standard C++ library exceptions before
creating your own. If a standard exception does what you need, chances are it’s
a lot easier for your user to understand and handle.
If the exception type you want isn’t part of the standard
library, try to inherit one from an existing standard exception. It’s nice if
your users can always write their code to expect the what( ) function
defined in the exception class interface.
Nest your own exceptions
If you create exceptions for your particular class, it’s a
good idea to nest the exception classes either inside your class or inside a
namespace containing your class, to provide a clear message to the reader that
this exception is only for your class. In addition, it prevents pollution of
the global namespace.
You can nest your exceptions even if you’re deriving them
from C++ Standard exceptions.
Use exception hierarchies
Using exception hierarchies is a valuable way to classify the types of critical errors that might be encountered with your class or library. This
gives helpful information to users, assists them in organizing their code, and
gives them the option of ignoring all the specific types of exceptions and just
catching the base-class type. Also, any exceptions added later by inheriting
from the same base class will not force all existing code to be rewritten—the
base-class handler will catch the new exception.
The Standard C++ exceptions are a good example of an
exception hierarchy. Build your exceptions on top of it if you can.
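Here is a small sketch of a nested hierarchy built on a standard exception. The Parser class and its exception names are hypothetical. EncodingError plays the role of an exception added later: report( ) has no handler written for it, yet the base-class handler still catches it.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

class Parser { // Hypothetical class with nested exceptions
public:
  class Error : public std::runtime_error {
  public:
    explicit Error(const std::string& msg)
    : std::runtime_error(msg) {}
  };
  class SyntaxError : public Error {
  public:
    explicit SyntaxError(const std::string& msg) : Error(msg) {}
  };
  class EncodingError : public Error { // Added later
  public:
    explicit EncodingError(const std::string& msg) : Error(msg) {}
  };
  static void parse(const std::string& text) {
    if(text.empty()) throw SyntaxError("empty input");
    if(text[0] == '?') throw EncodingError("bad byte");
  }
};

// Existing client code, written before EncodingError existed.
std::string report(const std::string& text) {
  try {
    Parser::parse(text);
    return "ok";
  } catch(const Parser::SyntaxError& e) {
    return std::string("syntax: ") + e.what();
  } catch(const Parser::Error& e) { // Also catches newer types
    return std::string("parser: ") + e.what();
  } catch(const std::exception& e) {
    return std::string("other: ") + e.what();
  }
}

// Usage sketch; call demo() from main().
void demo() {
  assert(report("abc") == "ok");
  assert(report("") == "syntax: empty input");
  assert(report("?x") == "parser: bad byte");
}
```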
Multiple inheritance (MI)
As you’ll read in Chapter 9, the only essential place
for MI is if you need to upcast an object pointer to two different base
classes—that is, if you need polymorphic behavior with both of those base
classes. It turns out that exception hierarchies are useful places for multiple
inheritance because a base-class handler from any of the roots of the multiply
inherited exception class can handle the exception.
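A minimal sketch of this idea follows; all three class names are hypothetical. AccessDenied has two unrelated roots, and a handler for either root can catch it.

```cpp
#include <cassert>
#include <string>

// Two hypothetical, unrelated exception roots
struct IoFailure { virtual ~IoFailure() {} };
struct SecurityFailure { virtual ~SecurityFailure() {} };

// An exception that "is-a" member of both hierarchies
struct AccessDenied : IoFailure, SecurityFailure {};

std::string catchAsIo() {
  std::string result = "not caught";
  try {
    throw AccessDenied();
  } catch(const IoFailure&) {
    result = "IoFailure";
  }
  return result;
}

std::string catchAsSecurity() {
  std::string result = "not caught";
  try {
    throw AccessDenied();
  } catch(const SecurityFailure&) {
    result = "SecurityFailure";
  }
  return result;
}

// Usage sketch; call demo() from main().
void demo() {
  assert(catchAsIo() == "IoFailure");
  assert(catchAsSecurity() == "SecurityFailure");
}
```

Each base must be public and unambiguous for the handler to match, which is why the two roots here share no common ancestor.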
Catch by reference, not by value
As you saw in the section “Exception matching,” you should
catch exceptions by reference for two reasons:
· To avoid making a needless copy of the exception object when it
is passed to the handler.
· To avoid object slicing when catching a derived exception as a
base class object.
Although you can also throw and catch pointers, by doing so you introduce more coupling—the thrower and the catcher must agree on
how the exception object is allocated and cleaned up. This is a problem because
the exception itself might have occurred from heap exhaustion. If you throw exception
objects, the exception-handling system takes care of all storage.
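The slicing problem in the second bullet above can be seen in this short sketch, which uses two hypothetical exception classes. Some compilers warn about the by-value handler, which is the point of the guideline.

```cpp
#include <cassert>
#include <string>

struct BaseError { // Hypothetical base exception
  virtual ~BaseError() {}
  virtual std::string name() const { return "BaseError"; }
};

struct DetailedError : BaseError { // Hypothetical derived exception
  std::string name() const { return "DetailedError"; }
};

std::string catchByValue() {
  std::string result;
  try {
    throw DetailedError();
  } catch(BaseError e) { // Copies only the BaseError part
    result = e.name();
  }
  return result;
}

std::string catchByReference() {
  std::string result;
  try {
    throw DetailedError();
  } catch(const BaseError& e) { // Refers to the original object
    result = e.name();
  }
  return result;
}

// Usage sketch; call demo() from main().
void demo() {
  assert(catchByValue() == "BaseError");         // Sliced
  assert(catchByReference() == "DetailedError"); // Intact
}
```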
Throw exceptions in constructors
Because a constructor has no return value, you’ve previously had two ways to report an error during construction:
· Set a nonlocal flag and hope the user checks it.
· Return an incompletely created object and hope the user checks
it.
This problem is serious because C programmers expect that
object creation is always successful, which is not unreasonable in C because
the types are so primitive. But continuing execution after construction fails in a C++ program is a guaranteed disaster, so constructors are one of the most
important places to throw exceptions—now you have a safe, effective way to
handle constructor errors. However, you must also pay attention to pointers
inside objects and the way cleanup occurs when an exception is thrown inside a
constructor.
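The following sketch shows what the language does for you when a constructor throws. Tracker and Connection are hypothetical classes; Tracker simply counts its own constructions and destructions so the cleanup becomes visible.

```cpp
#include <cassert>
#include <stdexcept>

struct Tracker { // Hypothetical: counts instances
  static int created;
  static int live;
  Tracker() { ++created; ++live; }
  ~Tracker() { --live; }
};
int Tracker::created = 0;
int Tracker::live = 0;

class Connection { // Hypothetical class with a member object
  Tracker resource; // Fully constructed before the body runs
public:
  explicit Connection(bool ok) {
    if(!ok) throw std::runtime_error("cannot connect");
  }
};

// Returns true if a Connection could be constructed.
bool tryConnect(bool ok) {
  try {
    Connection c(ok);
    return true;
  } catch(const std::runtime_error&) {
    return false; // There is no half-built object to misuse
  }
}

// Usage sketch; call demo() from main().
void demo() {
  assert(!tryConnect(false));
  assert(Tracker::created == 1); // The member was constructed...
  assert(Tracker::live == 0);    // ...and destroyed during unwinding
}
```

Member objects that finished construction are destroyed automatically; only raw pointers need the extra attention mentioned above.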
Don’t cause exceptions in destructors
Because destructors are called in the process of throwing other exceptions, you’ll never want to throw an exception in a destructor
or cause another exception to be thrown by some action you perform in the
destructor. If this happens, a new exception can be thrown before the
catch-clause for an existing exception is reached, which will cause a call to terminate( ).
If you call any functions inside a destructor that can throw
exceptions, those calls should be within a try block in the destructor,
and the destructor must handle all exceptions itself. None must escape from the
destructor.
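A destructor that follows this rule might look like the sketch below. LogFile, its flush( ) member, and the flushFailed flag are hypothetical; the flag stands in for whatever logging you would really do.

```cpp
#include <cassert>
#include <stdexcept>

bool flushFailed = false; // Records the failure instead of throwing

class LogFile { // Hypothetical class whose cleanup can fail
  bool broken;
  void flush() {
    if(broken) throw std::runtime_error("flush failed");
  }
public:
  explicit LogFile(bool isBroken) : broken(isBroken) {}
  ~LogFile() {
    try {
      flush();
    } catch(...) {
      flushFailed = true; // Handled here; nothing escapes
    }
  }
};

// Usage sketch; returns true if the destructor's failure was
// contained while another exception was in flight.
bool demo() {
  bool reachedHandler = false;
  try {
    LogFile log(true);
    throw std::logic_error("original problem"); // Unwinding destroys log
  } catch(const std::logic_error&) {
    reachedHandler = true; // terminate() was not called
  }
  return reachedHandler && flushFailed;
}
```

The original logic_error still reaches its handler because the second exception never leaves the destructor.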
Avoid naked pointers
See Wrapped.cpp earlier in this chapter. A naked
pointer usually means vulnerability in the constructor if resources are
allocated for that pointer. A pointer doesn’t have a destructor, so those
resources aren’t released if an exception is thrown in the constructor. Use auto_ptr
or other smart pointer types for
pointers that reference heap memory.
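Here is a sketch of the smart-pointer approach. It assumes a C++11 compiler and uses std::unique_ptr as the "other smart pointer type" in place of auto_ptr; Counted and Widget are hypothetical classes.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Counted { // Hypothetical: counts instances
  static int created;
  static int live;
  Counted() { ++created; ++live; }
  ~Counted() { --live; }
};
int Counted::created = 0;
int Counted::live = 0;

class Widget { // Hypothetical class owning a heap object
  std::unique_ptr<Counted> part; // A member with a destructor
public:
  explicit Widget(bool fail) : part(new Counted) {
    if(fail) throw std::runtime_error("constructor failed");
  }
};

// Returns the number of Counted objects still alive afterward.
int liveAfterConstruction(bool fail) {
  try {
    Widget w(fail);
  } catch(const std::runtime_error&) {
    // ~Widget never ran, but the member's destructor did
  }
  return Counted::live;
}

// Usage sketch; call demo() from main().
void demo() {
  assert(liveAfterConstruction(true) == 0);
  assert(Counted::created == 1); // The heap object was really allocated
}
```

Had part been a plain Counted*, the object allocated in the initializer list would have been left on the heap when the constructor threw.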
When an exception is thrown, there’s considerable runtime
overhead (but it’s good overhead, since objects are cleaned up
automatically!). For this reason, you never want to use exceptions as part of
your normal flow-of-control, no matter how tempting and clever it may seem.
Exceptions should occur only rarely, so the overhead is piled on the exception
and not on the normally executing code. One of the important design goals for
exception handling was that it could be implemented with no impact on execution
speed when it wasn’t used; that is, as long as you don’t throw an
exception, your code runs as fast as it would without exception handling.
Whether this is true depends on the particular compiler implementation you’re
using. (See the description of the “zero-cost model” later in this section.)
You can think of a throw expression as a call to a
special system function that takes the exception object as an argument and backtracks
up the chain of execution. For this to work, extra information needs to be put
on the stack by the compiler, to aid in stack unwinding. To understand this,
you need to know about the runtime stack.
Whenever a function is called, information about that
function is pushed onto the runtime stack in an activation record instance (ARI), also called a stack frame. A typical stack frame contains
the return address in the calling function (so execution can return to it), a pointer
to the ARI of the function’s static parent (the scope that lexically contains
the called function, so variables global to the function can be accessed), and
a pointer to the ARI of the function that called it (its dynamic parent). The path
that logically results from repetitively following the dynamic parent links is
the dynamic chain, or call chain, that we’ve mentioned previously
in this chapter. This is how execution can backtrack when an exception is
thrown, and it is the mechanism that makes it possible for components developed
without knowledge of one another to communicate errors at runtime.
To enable stack unwinding for exception handling, extra
exception-related information about each function needs to be available for
each stack frame. This information describes which destructors need to be
called (so that local objects can be cleaned up), indicates whether the current
function has a try block, and lists which exceptions the associated
catch clauses can handle. There is a space penalty for this extra information, so
programs that support exception handling can be somewhat larger than those that
don’t. Even the
compile-time size of programs using exception handling is greater, since the
logic of how to generate the expanded stack frames during runtime must be
generated by the compiler.
To illustrate this, we compiled the following program both
with and without exception-handling support in Borland C++ Builder and
Microsoft Visual C++:
//: C01:HasDestructor.cpp {O}
class HasDestructor {
public:
~HasDestructor() {}
};
void g(); // For all we know, g may throw.
void f() {
HasDestructor h;
g();
} ///:~
If exception handling is enabled, the compiler must keep
information about ~HasDestructor( ) available at runtime in the ARI
for f( ) (so it can destroy h properly should g( )
throw an exception). The following table summarizes the result of the
compilations in terms of the size of the compiled (.obj) files (in bytes).
| Compiler\Mode | With Exception Support | Without Exception Support |
| Borland | 616 | 234 |
| Microsoft | 1162 | 680 |
Don’t take the percentage
differences between the two modes too seriously. Remember that exceptions
(should) typically constitute a small part of a program, so the space overhead
tends to be much smaller (usually between 5 and 15 percent).
This extra housekeeping slows down execution, but a clever
compiler implementation avoids this. Since information about exception-handling
code and the offsets of local objects can be computed once at compile time,
such information can be kept in a single place associated with each function,
but not in each ARI. You essentially remove exception overhead from each ARI
and thus avoid the extra time to push them onto the stack. This approach is
called the zero-cost model of exception handling, and the optimized storage mentioned earlier is known as the shadow
stack.
Error recovery is a fundamental concern for every program
you write. It’s especially important in C++ when creating program components
for others to use. To create a robust system, each component must be robust.
The goals for exception handling in C++ are to simplify the
creation of large, reliable programs using less code than currently possible,
with more confidence that your application doesn’t have an unhandled error.
This is accomplished with little or no performance penalty and with low impact
on existing code.
Basic exceptions are not terribly difficult to learn; begin
using them in your programs as soon as you can. Exceptions are one of those
features that provide immediate and significant benefits to your project.
Solutions
to selected exercises can be found in the electronic document The Thinking
in C++ Volume 2 Annotated Solution Guide, available for a small fee from www.MindView.net.
1. Write three functions: one
that returns an error value to indicate an error condition, one that sets errno,
and one that uses signal( ). Write code that calls these functions
and responds to the errors. Now write a fourth function that throws an
exception. Call this function and catch the exception. Describe the differences
between these four approaches, and why exception handling is an improvement.
2. Create a class with member functions that throw exceptions.
Within this class, make a nested class to use as an exception object. It takes
a single const char* as its argument; this represents a description
string. Create a member function that throws this exception. (State this in the
function’s exception specification.) Write a try block that calls this
function and a catch clause that handles the exception by displaying its
description string.
3. Rewrite the Stash class from Chapter 13 of Volume 1 so
that it throws out_of_range exceptions for operator[ ].
4. Write a generic main( ) that takes all exceptions and
reports them as errors.
5. Create a class with its own operator new. This operator
should allocate ten objects, and on the eleventh object “run out of memory” and
throw an exception. Also add a static member function that reclaims this
memory. Now create a main( ) with a try block and a catch
clause that calls the memory-restoration routine. Put these inside a while
loop, to demonstrate recovering from an exception and continuing execution.
6. Create a destructor that throws an exception, and write code to
prove to yourself that this is a bad idea by showing that if a new exception is
thrown before the handler for the existing one is reached, terminate( )
is called.
7. Prove to yourself that all exception objects (the ones that are
thrown) are properly destroyed.
8. Prove to yourself that if you create an exception object on the
heap and throw the pointer to that object, it will not be cleaned up.
9. Write a function with an exception specification that can throw
four exception types: a char, an int, a bool, and your own
exception class. Catch each in main( ) and verify the catch. Derive
your exception class from a standard exception. Write the function in such a
way that the system recovers and tries to execute it again.
10. Modify your solution to the previous exercise to throw a double
from the function, violating the exception specification. Catch the violation
with your own unexpected handler that displays a message and exits the program
gracefully (meaning abort( ) is not called).
11. Write a Garage class that has a Car that is having
troubles with its Motor. Use a function-level try block in the Garage
class constructor to catch an exception (thrown from the Motor class)
when its Car object is initialized. Throw a different exception from the
body of the Garage constructor’s handler and catch it in main( ).
Writing “perfect software” may be an elusive goal for
developers, but a few defensive techniques, routinely applied, can go a long
way toward improving the quality of your code.
Although the complexity of typical production software
guarantees that testers will always have a job, we hope you still yearn to
produce defect-free software. Object-oriented design techniques do much to
corral the difficulty of large projects, but eventually you must write loops
and functions. These details of “programming in the small” become the building
blocks of the larger components needed for your designs. If your loops are off
by one or your functions calculate the correct values only “most” of the time,
you’re in trouble no matter how fancy your overall methodology. In this chapter, you’ll see practices that help create robust code regardless of the size of your
project.
Your code is, among other things, an expression of your attempt
to solve a problem. It should be clear to the reader (including yourself)
exactly what you were thinking when you designed that loop. At certain points
in your program, you should be able to make bold statements that some condition
or other holds. (If you can’t, you really haven’t yet solved the problem.) Such
statements are called invariants, since they should invariably be true
at the point where they appear in the code; if not, either your design is faulty,
or your code does not accurately reflect your design.
Consider a program that plays the guessing game of Hi-Lo. One
person thinks of a number between 1 and 100, and the other person guesses the
number. (We’ll let the computer do the guessing.) The person who holds the
number tells the guesser whether their guess is high, low or correct. The best
strategy for the guesser is a binary search, which chooses the midpoint
of the range of numbers where the sought-after number resides. The high-low
response tells the guesser which half of the list holds the number, and the
process repeats, halving the size of the active search range on each iteration.
So how do you write a loop to drive the repetition properly? It’s not
sufficient to just say
bool guessed = false;
while(!guessed) {
...
}
because a malicious user might respond deceitfully, and you
could spend all day guessing. What assumption, however simple, are you making
each time you guess? In other words, what condition should hold by design
on each loop iteration?
The simple assumption is that the secret number is within
the current active range of unguessed numbers: [1, 100]. Suppose we label the
endpoints of the range with the variables low and high.
Each time you pass through the loop you need to make sure that if the number
was in the range [low, high] at the beginning of the loop, you
calculate the new range so that it still contains the number at the end of the
current loop iteration.
The goal is to express the loop invariant in code so that a
violation can be detected at runtime. Unfortunately, since the computer doesn’t
know the secret number, you can’t express this condition directly in code, but
you can at least make a comment to that effect:
while(!guessed) {
// INVARIANT: the number is in the range [low, high]
...
}
What happens when the user says that a guess is too high or
too low when it isn’t? The deception will exclude the secret number from the
new subrange. Because one lie always leads to another, eventually your range
will diminish to nothing (since you shrink it by half each time and the secret
number isn’t in there). We can express this condition in the following program:
//: C02:HiLo.cpp {RunByHand}
// Plays the game of Hi-Lo to illustrate a loop invariant.
#include <cctype>   // For toupper()
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;
int main() {
  cout << "Think of a number between 1 and 100" << endl
       << "I will make a guess; "
       << "tell me if I'm (H)igh or (L)ow" << endl;
  int low = 1, high = 100;
  bool guessed = false;
  while(!guessed) {
    // Invariant: the number is in the range [low, high]
    if(low > high) { // Invariant violation
      cout << "You cheated! I quit" << endl;
      return EXIT_FAILURE;
    }
    int guess = (low + high) / 2;
    cout << "My guess is " << guess << ". ";
    cout << "(H)igh, (L)ow, or (E)qual? ";
    string response;
    cin >> response;
    switch(toupper(response[0])) {
      case 'H':
        high = guess - 1;
        break;
      case 'L':
        low = guess + 1;
        break;
      case 'E':
        guessed = true;
        break;
      default:
        cout << "Invalid response" << endl;
        continue;
    }
  }
  cout << "I got it!" << endl;
  return EXIT_SUCCESS;
} ///:~
The violation of the invariant is detected with the
condition if(low > high), because if the user always tells the truth,
we will always find the secret number before we run out of guesses.
We also use a standard C technique for reporting program
status to the calling context by returning different values from main( ).
It is portable to use the statement return 0; to indicate success, but
there is no portable value to indicate failure. For this reason we use the
macro declared for this purpose in <cstdlib>: EXIT_FAILURE.
For consistency, whenever we use EXIT_FAILURE we also use EXIT_SUCCESS,
even though the latter is always defined as zero.
The condition in the Hi-Lo program depends on user input, so
you can’t prevent a violation of the invariant. However, invariants usually depend
only on the code you write, so they will always hold if you’ve implemented your
design correctly. In this case, it is clearer to make an assertion, which is a positive statement that reveals your design decisions.
Suppose you are implementing a vector of integers: an
expandable array that grows on demand. The function that adds an element to the
vector must first verify that there is an open slot in the underlying array
that holds the elements; otherwise, it needs to request more heap space and
copy the existing elements to the new space before adding the new element (and
deleting the old array). Such a function might look like the following:
void MyVector::push_back(int x) {
if(nextSlot == capacity)
grow();
assert(nextSlot < capacity);
data[nextSlot++] = x;
}
In this example, data is a dynamic array of ints
with capacity slots and nextSlot slots in use. The purpose of grow( )
is to expand the size of data so that the new value of capacity
is strictly greater than nextSlot. Proper behavior of MyVector
depends on this design decision, and it will never fail if the rest of the
supporting code is correct. We assert the condition with the assert( )
macro, which is defined in the header <cassert>.
The Standard C library assert( ) macro is brief,
to the point, and portable. If the condition in its parameter evaluates to
non-zero, execution continues uninterrupted; if it doesn’t, a message
containing the text of the offending expression along with its source file name
and line number is printed to the standard error channel and the program
aborts. Is that too drastic? In practice, it is much more drastic to let
execution continue when a basic design assumption has failed. Your program
needs to be fixed.
If all goes well, you will thoroughly test your code with
all assertions intact by the time the final product is deployed. (We’ll say
more about testing later.) Depending on the nature of your application, the
machine cycles needed to test all assertions at runtime might be too much of a
performance hit in the field. If that’s the case, you can remove all the
assertion code automatically by defining the macro NDEBUG and rebuilding
the application.
To see how this works, note that a typical implementation of
assert( ) looks something like this:
#ifdef NDEBUG
#define assert(cond) ((void)0)
#else
void assertImpl(const char*, const char*, long);
#define assert(cond) \
((cond) ? (void)0 : assertImpl(???))
#endif
When the macro NDEBUG is defined, the code decays to
the expression (void) 0, so all that’s left in the compilation stream is
an essentially empty statement as a result of the semicolon you appended to
each assert( ) invocation. If NDEBUG is not defined, assert(cond)
expands to a conditional statement that, when cond is zero, calls a
compiler-dependent function (which we named assertImpl( )) with a
string argument representing the text of cond, along with the file name
and line number where the assertion appeared. (We used “???” as a place holder
in the example, but the string mentioned is actually computed there, along with
the file name and the line number where the macro occurs in that file. How
these values are obtained is immaterial to our discussion.) If you want to turn
assertions on and off at different points in your program, you must not only #define
or #undef NDEBUG, but you must also re-include <cassert>.
Macros are evaluated as the preprocessor encounters them and thus use whatever NDEBUG
state applies at the point of inclusion. The most common way to define NDEBUG
once for an entire program is as a compiler option, whether through project
settings in your visual environment or via the command line, as in:
mycc -DNDEBUG myfile.cpp
Most compilers use the -D flag to define macro names.
(Substitute the name of your compiler’s executable for mycc above.) The
advantage of this approach is that you can leave your assertions in the source
code as an invaluable bit of documentation, and yet there is no runtime
penalty. Because the code in an assertion disappears when NDEBUG is
defined, it is important that you never do work in an assertion. Only
test conditions that do not change the state of your program.
Whether using NDEBUG for released code is a good idea
remains a subject of debate. Tony Hoare, one of the most influential computer
scientists of all time, has
suggested that turning off runtime checks such as assertions is similar to a
sailing enthusiast who wears a life jacket while training on land and then
discards it when he goes to sea. If
an assertion fails in production, you have a problem much worse than
degradation in performance, so choose wisely.
Not all conditions should be enforced by assertions. User
errors and runtime resource failures should be signaled by throwing exceptions,
as we explained in detail in Chapter 1. It is tempting to use assertions for
most error conditions while roughing out code, with the intent to replace many
of them later with robust exception handling. As with any temptation, be
careful, since you might forget to make all the necessary changes later.
Remember: assertions are intended to verify design decisions that will only
fail because of faulty programmer logic. The ideal is to solve all assertion
violations during development. Don’t use assertions for conditions that aren’t
totally in your control (for example, conditions that depend on user input). In
particular, you wouldn’t want to use assertions to validate function arguments;
throw a logic_error instead.
The use of assertions as a tool to ensure program
correctness was formalized by Bertrand Meyer in his Design by Contract methodology. Every
function has an implicit contract with clients that, given certain preconditions, guarantees certain postconditions. In other words, the preconditions
are the requirements for using the function, such as supplying arguments within
certain ranges, and the postconditions are the results delivered by the
function, either by return value or by side-effect.
When client programs fail to give you valid input, you must
tell them they have broken the contract. This is not the best time to abort the
program (although you’re justified in doing so since the contract was
violated), but an exception is certainly appropriate. This is why the Standard
C++ library throws exceptions derived from logic_error, such as out_of_range. If there are
functions that only you call, however, such as private functions in a class of
your own design, the assert( ) macro is appropriate, since you have
total control over the situation and you certainly want to debug your code
before shipping.
A postcondition failure indicates a program error, and it is
appropriate to use assertions for any invariant at any time, including the
postcondition test at the end of a function. This applies in particular to
class member functions that maintain the state of an object. In the MyVector
example earlier, for instance, a reasonable invariant for all public member
functions would be:
assert(0 <= nextSlot && nextSlot <= capacity);
or, if nextSlot is an unsigned integer, simply
assert(nextSlot <= capacity);
Such an invariant is called a class invariant and can reasonably be enforced by an assertion. Subclasses play the role of subcontractor
to their base classes because they must maintain the original contract between the
base class and its clients. For this reason, the preconditions in derived
classes must impose no extra requirements beyond those in the base contract,
and the postconditions must deliver at least as much.
Validating results returned to the client, however, is
nothing more or less than testing, so using postcondition assertions in
this case would be duplicating work. Yes, it’s good documentation, but more
than one developer has been fooled into improperly using postcondition
assertions as a substitute for unit testing.
Writing software is all about meeting requirements. Creating these
requirements is difficult, and they can change from day to day; you might
discover at a weekly project meeting that what you just spent the week doing is
not exactly what the users really want.
People cannot articulate software requirements without
sampling an evolving, working system. It’s much better to specify a little,
design a little, code a little, and test a little. Then, after evaluating the
outcome, do it all over again. The ability to develop in such an iterative
fashion is one of the great advances of the object-oriented approach, but it
requires nimble programmers who can craft resilient code. Change is hard.
Another impetus for change comes from you, the programmer.
The craftsperson in you wants to continually improve the design of your code.
What maintenance programmer hasn’t cursed the aging, flagship company product
as a convoluted, unmodifiable patchwork of spaghetti? Management’s reluctance
to let you tamper with a functioning system robs code of the resilience it
needs to endure. “If it’s not broken, don’t fix it” eventually gives way to, “We
can’t fix it—rewrite it.” Change is necessary.
Fortunately, our industry is growing accustomed to the
discipline of refactoring, the art of internally restructuring code to
improve its design, without changing its behavior. Such
improvements include extracting a new function from another, or inversely,
combining member functions; replacing a member function with an object;
parameterizing a member function or class; and replacing conditionals with
polymorphism. Refactoring helps code evolve.
Whether the force for change comes from users or
programmers, changes today may break what worked yesterday. We need a way to
build code that withstands change and improves over time.
Extreme Programming (XP) is only one of
many practices that support a quick-on-your-feet motif. In this section we
explore what we think is the key to making flexible, incremental development
succeed: an easy-to-use automated unit test framework. (Note that testers,
software professionals who test others’ code for a living, are still indispensable.
Here, we are merely describing a way to help developers write better code.)
Developers write unit tests to gain the confidence to
say the two most important things that any developer can say:
1. I understand the requirements.
2. My code meets those requirements (to the best of my knowledge).
There is no better way to ensure that you know what the code
you’re about to write should do than to write the unit tests first. This simple
exercise helps focus the mind on the task ahead and will likely lead to working
code faster than just jumping into coding. Or, to express it in XP terms:
Testing + programming is faster
than just programming.
Writing tests first also guards you against boundary
conditions that might break your code, so your code is more robust.
When your code passes all your tests, you know that if the
system isn’t working, your code is probably not the problem. The statement “All
my tests pass” is a powerful argument.
So what does a unit test look like? Too often developers
just use some well-behaved input to produce some expected output, which they
inspect visually. Two dangers exist in this approach. First, programs don’t
always receive only well-behaved input. We all know that we should test the
boundaries of program input, but it’s hard to think about this when you’re
trying to just get things working. If you write the test for a function first
before you start coding, you can wear your “tester hat” and ask yourself, “What
could possibly make this break?” Code a test that will prove the function you’ll
write isn’t broken, and then put on your developer hat and make it happen. You’ll
write better code than if you hadn’t written the test first.
The second danger is that inspecting output visually is
tedious and error prone. Almost anything of this kind that a human can do, a computer can do
without human error. It’s better to formulate tests as collections of Boolean expressions and have a test program report any failures.
For example, suppose you need to build a Date class
that has the following properties:
· A date can be initialized with a string (YYYYMMDD), three
integers (Y, M, D), or nothing (giving today’s date).
· A date object can yield its year, month, and day or a string of
the form “YYYYMMDD”.
· All relational comparisons are available, as well as computing
the duration between two dates (in years, months, and days).
· Dates to be compared need to be able to span an arbitrary number
of centuries (for example, 1600–2200).
Your class can store three integers representing the year,
month, and day. (Just be sure the year is at least 16 bits in size to satisfy
the last bulleted item.) The interface for your Date class might look
like this:
//: C02:Date1.h
// A first pass at Date.h.
#ifndef DATE1_H
#define DATE1_H
#include <string>
class Date {
public:
// A struct to hold elapsed time:
struct Duration {
int years;
int months;
int days;
Duration(int y, int m, int d)
: years(y), months(m), days(d) {}
};
Date();
Date(int year, int month, int day);
Date(const std::string&);
int getYear() const;
int getMonth() const;
int getDay() const;
std::string toString() const;
friend bool operator<(const Date&, const Date&);
friend bool operator>(const Date&, const Date&);
friend bool operator<=(const Date&, const Date&);
friend bool operator>=(const Date&, const Date&);
friend bool operator==(const Date&, const Date&);
friend bool operator!=(const Date&, const Date&);
friend Duration duration(const Date&, const Date&);
};
#endif // DATE1_H ///:~
Before you implement this class, you can solidify your grasp
of the requirements by writing the beginnings of a test program. You might come
up with something like the following:
//: C02:SimpleDateTest.cpp
//{L} Date
#include <iostream>
#include "Date.h" // From Appendix B
using namespace std;
// Test machinery
int nPass = 0, nFail = 0;
void test(bool t) { if(t) nPass++; else nFail++; }
int main() {
Date mybday(1951, 10, 1);
test(mybday.getYear() == 1951);
test(mybday.getMonth() == 10);
test(mybday.getDay() == 1);
cout << "Passed: " << nPass
<< ", Failed: "
<< nFail << endl;
}
/* Expected output:
Passed: 3, Failed: 0
*/ ///:~
In this trivial case, the function test( )
maintains the global variables nPass and nFail. The only visual
inspection you do is to read the final score. A more sophisticated
test( ) would also display an appropriate message when a test fails. The
framework described later in this chapter has such a test function, among other
things.
You can now implement enough of the Date class to get
these tests to pass, and then you can proceed iteratively until all the
requirements are met. By writing tests first, you are more likely to think of
corner cases that might break your upcoming implementation, and you’re more
likely to write the code correctly the first time. Such an exercise might
produce the following version of a test for the Date class:
//: C02:SimpleDateTest2.cpp
//{L} Date
#include <iostream>
#include "Date.h"
using namespace std;
// Test machinery
int nPass = 0, nFail = 0;
void test(bool t) { if(t) ++nPass; else ++nFail; }
int main() {
Date mybday(1951, 10, 1);
Date today;
Date myevebday("19510930");
// Test the operators
test(mybday < today);
test(mybday <= today);
test(mybday != today);
test(mybday == mybday);
test(mybday >= mybday);
test(mybday <= mybday);
test(myevebday < mybday);
test(mybday > myevebday);
test(mybday >= myevebday);
test(mybday != myevebday);
// Test the functions
test(mybday.getYear() == 1951);
test(mybday.getMonth() == 10);
test(mybday.getDay() == 1);
test(myevebday.getYear() == 1951);
test(myevebday.getMonth() == 9);
test(myevebday.getDay() == 30);
test(mybday.toString() == "19511001");
test(myevebday.toString() == "19510930");
// Test duration
Date d2(2003, 7, 4);
Date::Duration dur = duration(mybday, d2);
test(dur.years == 51);
test(dur.months == 9);
test(dur.days == 3);
// Report results:
cout << "Passed: " << nPass
<< ", Failed: "
<< nFail << endl;
} ///:~
This test can be more fully developed. For example, we haven’t
tested that long durations are handled correctly. We’ll stop here, but you get
the idea. The full implementation for the Date class is available in the
files Date.h and Date.cpp in the appendix.
Some automated C++ unit test tools are available on the
World Wide Web for download, such as CppUnit. Our
purpose here is to present a test mechanism that is not only easy to use, but
also easy to understand internally and even modify if necessary. So, in the
spirit of “Do The Simplest Thing That Could Possibly Work,” we
have developed the TestSuite Framework, a namespace named TestSuite
that contains two key classes: Test and Suite.
The Test class is an abstract base class from which you
derive a test object. It keeps track of the number of passes and failures and
displays the text of any test condition that fails. You simply override the run( )
member function, which should in turn call the test_( ) macro for
each Boolean test condition you define.
To define a test for the Date class using the
framework, you can inherit from Test as shown in the following program:
//: C02:DateTest.h
#ifndef DATETEST_H
#define DATETEST_H
#include "Date.h"
#include "../TestSuite/Test.h"
class DateTest : public TestSuite::Test {
Date mybday;
Date today;
Date myevebday;
public:
DateTest(): mybday(1951, 10, 1),
myevebday("19510930") {}
void run() {
testOps();
testFunctions();
testDuration();
}
void testOps() {
test_(mybday < today);
test_(mybday <= today);
test_(mybday != today);
test_(mybday == mybday);
test_(mybday >= mybday);
test_(mybday <= mybday);
test_(myevebday < mybday);
test_(mybday > myevebday);
test_(mybday >= myevebday);
test_(mybday != myevebday);
}
void testFunctions() {
test_(mybday.getYear() == 1951);
test_(mybday.getMonth() == 10);
test_(mybday.getDay() == 1);
test_(myevebday.getYear() == 1951);
test_(myevebday.getMonth() == 9);
test_(myevebday.getDay() == 30);
test_(mybday.toString() == "19511001");
test_(myevebday.toString() == "19510930");
}
void testDuration() {
Date d2(2003, 7, 4);
Date::Duration dur = duration(mybday, d2);
test_(dur.years == 51);
test_(dur.months == 9);
test_(dur.days == 3);
}
};
#endif // DATETEST_H ///:~
Running the test is a simple matter of instantiating a DateTest
object and calling its run( ) member function:
//: C02:DateTest.cpp
// Automated testing (with a framework).
//{L} Date ../TestSuite/Test
#include <iostream>
#include "DateTest.h"
using namespace std;
int main() {
DateTest test;
test.run();
return test.report();
}
/* Output:
Test "DateTest":
Passed: 21, Failed: 0
*/ ///:~
The Test::report( ) function displays the
previous output and returns the number of failures, so it is suitable to use as
a return value from main( ).
The Test class uses RTTI to
get the name of your class (for example, DateTest) for the report. There
is also a setStream( ) member function if you want the test results
sent to a file instead of to the standard output (the default). You’ll see the Test
class implementation later in this chapter.
The test_( ) macro can extract the text of the
Boolean condition that fails, along with its file name and line number. To see what
happens when a failure occurs, you can introduce an intentional error in the
code, for example by reversing the condition in the first call to test_( )
in DateTest::testOps( ) in the previous example code. The output
indicates exactly what test was in error and where it happened:
DateTest failure: (mybday > today) , DateTest.h (line 31)
Test "DateTest":
Passed: 20 Failed: 1
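The mechanism behind this report can be tried in isolation. The following is a minimal sketch, not the framework's own code: the names describeFailure( ) and DESCRIBE_( ) are hypothetical, and the sketch only shows how the preprocessor's # operator, __FILE__, and __LINE__ supply the text of the condition and its location.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of TestSuite): formats a failure
// message from the pieces the preprocessor supplies.
std::string describeFailure(const char* condText,
                            const char* file, long line) {
  std::ostringstream os;
  os << "failure: (" << condText << ") , " << file
     << " (line " << line << ")";
  return os.str();
}

// #cond turns the macro argument into a string literal;
// __FILE__ and __LINE__ expand at the point of use.
// Yields an empty string when the condition holds.
#define DESCRIBE_(cond) \
  ((cond) ? std::string() \
          : describeFailure(#cond, __FILE__, __LINE__))
```

Because the stringizing happens inside the macro, the caller writes the condition only once and the message cannot drift out of step with the code being tested.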
In addition to test_( ), the framework includes
the functions succeed_( ) and fail_( ), for cases where
a Boolean test won’t do. These functions apply when the class you’re testing
might throw exceptions. During testing, create an input set that will cause the
exception to occur. If it doesn’t, it’s an error and you call fail_( )
explicitly to display a message and update the failure count. If it does throw
the exception as expected, you call succeed_( ) to update the
success count.
To illustrate, suppose we modify the specification of the
two non-default Date constructors to throw a DateError exception
(a type nested inside Date and derived from std::logic_error) if
the input parameters do not represent a valid date:
Date(const string& s) throw(DateError);
Date(int year, int month, int day) throw(DateError);
The DateTest::run( ) member function can now
call the following function to test the exception handling:
void testExceptions() {
try {
Date d(0,0,0); // Invalid
fail_("Invalid date undetected in Date int ctor");
} catch(Date::DateError&) {
succeed_();
}
try {
Date d(""); // Invalid
fail_("Invalid date undetected in Date string ctor");
} catch(Date::DateError&) {
succeed_();
}
}
In both cases, if an exception is not thrown, it is an
error. Notice that you must manually pass a message to fail_( ),
since no Boolean expression is being evaluated.
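The same try/catch pattern can be sketched without the framework or the Date class. In the sketch below, checkedMonth( ), Counts, and expectOutOfRange( ) are all hypothetical names introduced for illustration; the two counter increments merely stand in for fail_( ) and succeed_( ).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical function under test: rejects months outside 1..12.
int checkedMonth(int m) {
  if(m < 1 || m > 12)
    throw std::out_of_range("month out of range");
  return m;
}

// Minimal stand-in for the framework's pass/fail counters.
struct Counts {
  long nPass;
  long nFail;
  Counts() : nPass(0), nFail(0) {}
};

// Records a pass only if the expected exception is thrown.
void expectOutOfRange(int month, Counts& c) {
  try {
    checkedMonth(month);
    ++c.nFail;  // No exception: plays the role of fail_()
  } catch(const std::out_of_range&) {
    ++c.nPass;  // Expected exception: plays the role of succeed_()
  }
}
```

Note that a valid input counts as a failure here, because the test asserts that the exception must occur; a test for valid inputs would be written separately with an ordinary Boolean check.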
Real projects usually contain many classes, so you need a
way to group tests so that you can just push a single button to test the entire
project. The Suite
class collects tests into a functional unit. You add Test objects to a Suite
with the addTest( ) member function, or you can include an entire
existing suite with addSuite( ). To illustrate, the following
example collects the programs in Chapter 3 that use the Test class into
a single suite. Note that this file will appear in the Chapter 3 subdirectory:
//: C03:StringSuite.cpp
//{L} ../TestSuite/Test ../TestSuite/Suite
//{L} TrimTest
// Illustrates a test suite for code from Chapter 3
#include <iostream>
#include "../TestSuite/Suite.h"
#include "StringStorage.h"
#include "Sieve.h"
#include "Find.h"
#include "Rparse.h"
#include "TrimTest.h"
#include "CompStr.h"
using namespace std;
using namespace TestSuite;
int main() {
Suite suite("String Tests");
suite.addTest(new StringStorageTest);
suite.addTest(new SieveTest);
suite.addTest(new FindTest);
suite.addTest(new RparseTest);
suite.addTest(new TrimTest);
suite.addTest(new CompStrTest);
suite.run();
long nFail = suite.report();
suite.free();
return nFail;
}
/* Output:
s1 = 62345
s2 = 12345
Suite "String Tests"
====================
Test "StringStorageTest":
Passed: 2 Failed: 0
Test "SieveTest":
Passed: 50 Failed: 0
Test "FindTest":
Passed: 9 Failed: 0
Test "RparseTest":
Passed: 8 Failed: 0
Test "TrimTest":
Passed: 11 Failed: 0
Test "CompStrTest":
Passed: 8 Failed: 0
*/ ///:~
Five of the above tests are completely contained in header
files. TrimTest is not, because it contains static data that must be defined
in an implementation file. The first two output lines are trace lines from
the StringStorage test. You must give the suite a name as a constructor
argument. The Suite::run( ) member function calls Test::run( )
for each of its contained tests. Much the same thing happens for Suite::report( ),
except that you can send the individual test reports to a different destination
stream than that of the suite report. If the test passed to addTest( )
already has a stream pointer assigned, it keeps it. Otherwise, it gets its
stream from the Suite object. (As with Test, there is an optional
second argument to the suite constructor that defaults to std::cout.)
The destructor for Suite does not automatically delete the contained Test
pointers because they don’t need to reside on the heap; that’s the job of Suite::free( ).
The test framework code is in a subdirectory called TestSuite
in the code distribution available at www.MindView.net. To use it, add the
TestSuite subdirectory to your header search path, link the
object files, and add the TestSuite subdirectory to the library
search path. Here is the header file Test.h:
//: TestSuite:Test.h
#ifndef TEST_H
#define TEST_H
#include <string>
#include <iostream>
#include <cassert>
using std::string;
using std::ostream;
using std::cout;
// fail_() has an underscore to prevent collision with
// ios::fail(). For consistency, test_() and succeed_()
// also have underscores.
#define test_(cond) \
do_test(cond, #cond, __FILE__, __LINE__)
#define fail_(str) \
do_fail(str, __FILE__, __LINE__)
namespace TestSuite {
class Test {
ostream* osptr;
long nPass;
long nFail;
// Disallowed:
Test(const Test&);
Test& operator=(const Test&);
protected:
void do_test(bool cond, const string& lbl,
const char* fname, long lineno);
void do_fail(const string& lbl,
const char* fname, long lineno);
public:
Test(ostream* osptr = &cout) {
this->osptr = osptr;
nPass = nFail = 0;
}
virtual ~Test() {}
virtual void run() = 0;
long getNumPassed() const { return nPass; }
long getNumFailed() const { return nFail; }
const ostream* getStream() const { return osptr; }
void setStream(ostream* osptr) { this->osptr = osptr; }
void succeed_() { ++nPass; }
long report() const;
virtual void reset() { nPass = nFail = 0; }
};
} // namespace TestSuite
#endif // TEST_H ///:~
There are three virtual functions in the Test class:
· A virtual destructor
· The function reset( )
· The pure virtual function run( )
As explained in Volume 1, it is an error to delete a derived
heap object through a base pointer unless the base class has a virtual
destructor. Any class intended to be a base class (usually evidenced by the
presence of at least one other virtual function) should have a virtual
destructor. The default implementation of Test::reset( ) resets
the success and failure counters to zero. You might want to override this function
to reset the state of the data in your derived test object; just be sure to
call Test::reset( ) explicitly in your override so that the
counters are reset. The Test::run( ) member function is pure
virtual since you are required to override it in your derived class.
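The following sketch shows what such an override might look like. It is only an illustration under stated assumptions: MiniTest is a condensed stand-in for TestSuite::Test (counters and reset( ) only), and BufferTest with its scratch member is a hypothetical derived test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Condensed stand-in for TestSuite::Test: counters and reset() only.
class MiniTest {
  long nPass, nFail;
public:
  MiniTest() : nPass(0), nFail(0) {}
  virtual ~MiniTest() {}
  void succeed_() { ++nPass; }
  long getNumPassed() const { return nPass; }
  virtual void reset() { nPass = nFail = 0; }
};

// Hypothetical derived test that owns extra state.
class BufferTest : public MiniTest {
  std::vector<int> scratch;
public:
  void run() {
    scratch.push_back(42);
    succeed_();
  }
  std::size_t scratchSize() const { return scratch.size(); }
  void reset() {
    scratch.clear();     // Reset derived-class state
    MiniTest::reset();   // Then reset the counters
  }
};
```

Because reset( ) is virtual, a call through a base reference or pointer (which is how a suite would reach it) clears both the derived state and the counters.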
The test_( ) and fail_( ) macros can
include file name and line number information available from the preprocessor.
We originally omitted the trailing underscores in the names, but the fail( )
macro then collided with ios::fail( ), causing compiler errors.
Here is the implementation of the remainder of the Test
functions:
//: TestSuite:Test.cpp {O}
#include "Test.h"
#include <iostream>
#include <typeinfo>
using namespace std;
using namespace TestSuite;
void Test::do_test(bool cond, const std::string& lbl,
const char* fname, long lineno) {
if(!cond)
do_fail(lbl, fname, lineno);
else
succeed_();
}
void Test::do_fail(const std::string& lbl,
const char* fname, long lineno) {
++nFail;
if(osptr) {
*osptr << typeid(*this).name()
<< " failure: (" << lbl
<< ") , " << fname
<< " (line " << lineno
<< ")" << endl;
}
}
long Test::report() const {
if(osptr) {
*osptr << "Test \"" << typeid(*this).name()
<< "\":\n\tPassed: " << nPass
<< "\tFailed: " << nFail
<< endl;
}
return nFail;
} ///:~
The Test class keeps track of the number of successes
and failures as well as the stream where you want Test::report( )
to display the results. The test_( ) and fail_( )
macros extract the current file name and line number information from the
preprocessor and pass both to do_test( ) and do_fail( ),
respectively, which do the actual work of displaying a
message and updating the appropriate counter. We can’t think of a good reason
to allow copy and assignment of test objects, so we have disallowed these
operations by making their prototypes private and omitting their respective
function bodies.
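If your compiler supports C++11 or later, the same intent can be stated directly with deleted functions. This is an alternative to the technique used in Test, not what the framework does; NoCopy is a hypothetical class used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class, not part of TestSuite. Requires C++11.
class NoCopy {
public:
  NoCopy() {}
  NoCopy(const NoCopy&) = delete;             // No copy-construction
  NoCopy& operator=(const NoCopy&) = delete;  // No assignment
};
```

With the private-declaration approach, an accidental copy inside a member or friend function is caught only at link time; with deleted functions, every attempted copy is rejected by the compiler.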
Here is the header file for Suite:
//: TestSuite:Suite.h
#ifndef SUITE_H
#define SUITE_H
#include <vector>
#include <stdexcept>
#include "../TestSuite/Test.h"
using std::vector;
using std::logic_error;
namespace TestSuite {
class TestSuiteError : public logic_error {
public:
TestSuiteError(const string& s = "")
: logic_error(s) {}
};
class Suite {
string name;
ostream* osptr;
vector<Test*> tests;
void reset();
// Disallowed ops:
Suite(const Suite&);
Suite& operator=(const Suite&);
public:
Suite(const string& name, ostream* osptr = &cout)
: name(name) { this->osptr = osptr; }
string getName() const { return name; }
long getNumPassed() const;
long getNumFailed() const;
const ostream* getStream() const { return osptr; }
void setStream(ostream* osptr) { this->osptr = osptr; }
void addTest(Test* t) throw(TestSuiteError);
void addSuite(const Suite&);
void run(); // Calls Test::run() repeatedly
long report() const;
void free(); // Deletes tests
};
} // namespace TestSuite
#endif // SUITE_H ///:~
The Suite class holds pointers to its Test
objects in a vector. Notice the exception specification on the addTest( )
member function. When you add a test to a suite, Suite::addTest( )
verifies that the pointer you pass is not null; if it is null, it throws a TestSuiteError
exception. Since this makes it impossible to add a null pointer to a suite, addSuite( )
asserts this condition on each of its tests, as do the other functions that
traverse the vector of tests (see the following implementation). Copy
and assignment are disallowed as they are in the Test class.
//: TestSuite:Suite.cpp {O}
#include "Suite.h"
#include <iostream>
#include <cassert>
#include <cstddef>
using namespace std;
using namespace TestSuite;
void Suite::addTest(Test* t) throw(TestSuiteError) {
// Verify test is valid and has a stream:
if(t == 0)
throw TestSuiteError("Null test in Suite::addTest");
else if(osptr && !t->getStream())
t->setStream(osptr);
tests.push_back(t);
t->reset();
}
void Suite::addSuite(const Suite& s) {
for(size_t i = 0; i < s.tests.size(); ++i) {
assert(s.tests[i]); // Check the entry in the source suite
addTest(s.tests[i]);
}
}
void Suite::free() {
for(size_t i = 0; i < tests.size(); ++i) {
delete tests[i];
tests[i] = 0;
}
}
void Suite::run() {
reset();
for(size_t i = 0; i < tests.size(); ++i) {
assert(tests[i]);
tests[i]->run();
}
}
long Suite::report() const {
if(osptr) {
long totFail = 0;
*osptr << "Suite \"" << name
<< "\"\n=======";
size_t i;
for(i = 0; i < name.size(); ++i)
*osptr << '=';
*osptr << "=" << endl;
for(i = 0; i < tests.size(); ++i) {
assert(tests[i]);
totFail += tests[i]->report();
}
*osptr << "=======";
for(i = 0; i < name.size(); ++i)
*osptr << '=';
*osptr << "=" << endl;
return totFail;
}
else
return getNumFailed();
}
long Suite::getNumPassed() const {
long totPass = 0;
for(size_t i = 0; i < tests.size(); ++i) {
assert(tests[i]);
totPass += tests[i]->getNumPassed();
}
return totPass;
}
long Suite::getNumFailed() const {
long totFail = 0;
for(size_t i = 0; i < tests.size(); ++i) {
assert(tests[i]);
totFail += tests[i]->getNumFailed();
}
return totFail;
}
void Suite::reset() {
for(size_t i = 0; i < tests.size(); ++i) {
assert(tests[i]);
tests[i]->reset();
}
} ///:~
We will be using the TestSuite framework wherever it
applies throughout the rest of this book.
The best debugging habit is to use assertions as explained
in the beginning of this chapter; by doing so you’ll help find logic errors
before they cause real trouble. This section contains some other tips and
techniques that might help during debugging.
Sometimes it’s useful to print the code of each statement as
it is executed, either to cout or to a trace file. Here’s a preprocessor
macro to accomplish this:
#define TRACE(ARG) cout << #ARG << endl; ARG
Now you can go through and surround the statements you trace
with this macro. However, this can introduce problems. For example, if you take
the statement:
for(int i = 0; i < 100; i++)
cout << i << endl;
and put both lines inside TRACE( ) macros, you
get this:
TRACE(for(int i = 0; i < 100; i++))
TRACE( cout << i << endl;)
which expands to this:
cout << "for(int i = 0; i < 100; i++)" << endl;
for(int i = 0; i < 100; i++)
cout << "cout << i << endl;" << endl;
cout << i << endl;
which isn’t exactly what you want. Thus, you must use this
technique carefully.
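One way to stay out of trouble is to trace only complete statements and to put braces around loop bodies, so that both statements produced by the macro remain inside the loop. The sketch below assumes a hypothetical variant, TRACE_TO( ), that takes the output stream as an argument so the result can be captured in a string; tracedLoop( ) is likewise just an illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical variant of TRACE() that writes to a chosen stream.
#define TRACE_TO(OS, ARG) (OS) << #ARG << '\n'; ARG

// Traces only the complete statement inside a braced loop body,
// so both expanded statements stay within the loop.
std::string tracedLoop() {
  std::ostringstream out;
  int sum = 0;
  for(int i = 0; i < 2; i++) {
    TRACE_TO(out, sum += i;)
  }
  out << "sum=" << sum;
  return out.str();
}
```

Here the traced statement is echoed once per iteration and then executed, which is the behavior you would expect from a single-stepping debugger.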
The following is a variation on the TRACE( )
macro:
#define D(a) cout << #a "=[" << a << "]" << endl;
If you want to display an expression, you simply put it
inside a call to D( ). The expression is displayed, followed by its
value (assuming there’s an overloaded operator << for the result
type). For example, you can say D(a + b). You can use this macro any
time you want to check an intermediate value.
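As a sketch of how the expansion behaves, the following hypothetical variant, D_TO( ), writes to a stream you supply and puts parentheses around the expression; showSum( ) exists only to exercise it. Neither name comes from the book's code.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical variant of D() that writes to a chosen stream.
// The adjacent string literals #a and "=[" are concatenated.
#define D_TO(OS, a) (OS) << #a "=[" << (a) << "]" << '\n'

std::string showSum(int a, int b) {
  std::ostringstream out;
  D_TO(out, a + b);  // Inserts the text "a + b" and then the value
  return out.str();
}
```

The parentheses around a matter for expressions whose operators bind less tightly than <<, such as a comparison; without them an argument like x < y would not compile as intended.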
These two macros represent the two most fundamental things
you do with a debugger: trace through the code execution and display values. A
good debugger is an excellent productivity tool, but sometimes debuggers are
not available, or it’s not convenient to use them. These techniques always
work, regardless of the situation.
DISCLAIMER: This section and the next contain code which is
officially unsanctioned by the C++ Standard. In particular, we redefine cout
and new via macros, which can cause surprising results if you’re not
careful. Our examples work on all the compilers we use, however, and provide
useful information. This is the only place in this book where we will depart
from the sanctity of standard-compliant coding practice. Use at your own risk!
Note that for this to work, a using-declaration or using-directive must be in effect so that cout
is written without its namespace prefix; the qualified name std::cout will not work.
The following code easily creates a trace file and sends all
the output that would normally go to cout into that file. All you must
do is #define TRACEON and include the header file (of course, it’s
fairly easy just to write the two key lines right into your file):
//: C03:Trace.h
// Creating a trace file.
#ifndef TRACE_H
#define TRACE_H
#include <fstream>
#ifdef TRACEON
std::ofstream TRACEFILE__("TRACE.OUT");
#define cout TRACEFILE__
#endif
#endif // TRACE_H ///:~
Here’s a simple test of the previous file:
//: C03:Tracetst.cpp {-bor}
#include <iostream>
#include <fstream>
#include "../require.h"
using namespace std;
#define TRACEON
#include "Trace.h"
int main() {
ifstream f("Tracetst.cpp");
assure(f, "Tracetst.cpp");
cout << f.rdbuf(); // Dumps file contents to file
} ///:~
Because cout has been textually turned into something
else by Trace.h, all the cout statements in your program now send
information to the trace file. This is a convenient way of capturing your
output into a file, in case your operating system doesn’t make output
redirection easy.
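If you would rather not redefine cout with a macro, a similar effect can be had within the Standard by swapping the stream buffer that cout uses. The following is a minimal sketch of that alternative, not the technique in Trace.h; CoutRedirect and captureGreeting( ) are hypothetical names, and an ostringstream stands in for the trace file so the result is easy to inspect.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical RAII helper: routes std::cout to another stream's
// buffer and restores the original buffer on destruction.
class CoutRedirect {
  std::streambuf* saved;
  CoutRedirect(const CoutRedirect&);            // Disallowed
  CoutRedirect& operator=(const CoutRedirect&); // Disallowed
public:
  explicit CoutRedirect(std::ostream& target)
    : saved(std::cout.rdbuf(target.rdbuf())) {}
  ~CoutRedirect() { std::cout.rdbuf(saved); }
};

std::string captureGreeting() {
  std::ostringstream trace;
  {
    CoutRedirect redirect(trace);
    std::cout << "hello" << std::endl;  // Lands in trace
  }                                     // cout restored here
  return trace.str();
}
```

To capture output in a file instead, you could pass a std::ofstream opened on a name of your choosing; this approach also works when the code writes to std::cout with its namespace prefix.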
The following straightforward debugging techniques are
explained in Volume 1:
1. For array bounds checking, use the Array template in C16:Array3.cpp
of Volume 1 for all arrays. You can turn off the checking and increase
efficiency when you’re ready to ship. (Although this doesn’t deal with the case
of taking a pointer to an array.)
2. Check for non-virtual destructors in base classes.
Tracking new/delete and malloc/free
Common problems with memory allocation include mistakenly
calling delete for memory that’s not on the free store, deleting the
same free-store memory more than once, and, most often, forgetting to delete a pointer.
This section discusses a system that can help you track down these kinds of
problems.
As an additional disclaimer beyond that of the
preceding section: because of the way we overload new, the following
technique may not work on all platforms, and will only work for programs that
do not call the function operator new( ) explicitly. We have been quite careful in this book to only present code that fully conforms to the C++
Standard, but in this one instance we’re making an exception for the following
reasons:
1. Even though it’s technically illegal, it works on many compilers.
2. We illustrate some useful thinking along the way.
To use the memory checking system, you simply include the
header file MemCheck.h, link the MemCheck.obj file into your
application to intercept all the calls to new and delete, and
call the macro MEM_ON( ) (explained later in this section) to
initiate memory tracing. A trace of all allocations and deallocations is
printed to the standard output (via stdout). When you use this system,
all calls to new store information about the file and line where they were called. This is accomplished by using the placement syntax for operator
new. Although you
typically use the placement syntax when you need to place objects at a specific
point in memory, it can also create an operator new( ) with any
number of arguments. This is used in the following example to store the results
of the __FILE__ and __LINE__ macros whenever new is
called:
//: C02:MemCheck.h
#ifndef MEMCHECK_H
#define MEMCHECK_H
#include <cstddef> // For size_t
// Usurp the new operator (both scalar and array versions)
void* operator new(std::size_t, const char*, long);
void* operator new[](std::size_t, const char*, long);
#define new new (__FILE__, __LINE__)
extern bool traceFlag;
#define TRACE_ON() traceFlag = true
#define TRACE_OFF() traceFlag = false
extern bool activeFlag;
#define MEM_ON() activeFlag = true
#define MEM_OFF() activeFlag = false
#endif // MEMCHECK_H ///:~
It is important to include this file in any source file in
which you want to track free store activity, but include it last (after
your other #include directives). Most headers in the standard library
are templates, and since most compilers use the inclusion model of
template compilation (meaning all source code is in the headers), the macro
that replaces new in MemCheck.h would usurp all instances of the new
operator in the library source code (and would likely result in compile
errors). Besides, you are only interested in tracking your own memory errors,
not the library’s.
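The placement-syntax mechanism itself can be tried without the macro. The sketch below is an illustration under stated assumptions, not a piece of MemCheck: lastFile, lastLine, and makeTrackedInt( ) are hypothetical, and the allocation is forwarded to the ordinary ::operator new (rather than malloc( )) so that a plain delete is the correct way to release the memory.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bookkeeping for this sketch only.
const char* lastFile = 0;
long lastLine = 0;

// An allocation function with extra parameters; it is selected by
// the placement syntax: new (file, line) T.
void* operator new(std::size_t siz, const char* file, long line) {
  lastFile = file;
  lastLine = line;
  return ::operator new(siz);  // Defer to the ordinary allocator
}

// Matching placement delete, called only if a constructor throws.
void operator delete(void* p, const char*, long) {
  ::operator delete(p);
}

int* makeTrackedInt(int value) {
  // Written out by hand here; MemCheck.h generates this via its macro.
  return new (__FILE__, __LINE__) int(value);
}
```

Writing the arguments explicitly at one call site shows what the macro in MemCheck.h arranges to happen at every use of new in the files that include it.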
In the following file, which contains the memory tracking
implementation, everything is done with C standard I/O rather than with C++
iostreams. It shouldn’t make a difference, since we’re not interfering with
iostreams’ use of the free store, but when we tried it, some compilers
complained. All compilers were happy with the <cstdio> version.
//: C02:MemCheck.cpp {O}
#include <cstdio>
#include <cstdlib>
#include <cassert>
#include <cstddef>
using namespace std;
#undef new
// Global flags set by macros in MemCheck.h
bool traceFlag = true;
bool activeFlag = false;
namespace {
// Memory map entry type
struct Info {
void* ptr;
const char* file;
long line;
};
// Memory map data
const size_t MAXPTRS = 10000u;
Info memMap[MAXPTRS];
size_t nptrs = 0;
// Searches the map for an address
int findPtr(void* p) {
for(size_t i = 0; i < nptrs; ++i)
if(memMap[i].ptr == p)
return i;
return -1;
}
void delPtr(void* p) {
int pos = findPtr(p);
assert(pos >= 0);
// Remove pointer from map
for(size_t i = pos; i < nptrs-1; ++i)
memMap[i] = memMap[i+1];
--nptrs;
}
// Dummy type for static destructor
struct Sentinel {
~Sentinel() {
if(nptrs > 0) {
printf("Leaked memory at:\n");
for(size_t i = 0; i < nptrs; ++i)
printf("\t%p (file: %s, line %ld)\n",
memMap[i].ptr, memMap[i].file,
memMap[i].line);
}
else
printf("No user memory leaks!\n");
}
};
// Static dummy object
Sentinel s;
} // End anonymous namespace
// Overload scalar new
void*
operator new(size_t siz, const char* file, long line) {
void* p = malloc(siz);
if(activeFlag) {
if(nptrs == MAXPTRS) {
printf("memory map too small (increase MAXPTRS)\n");
exit(1);
}
memMap[nptrs].ptr = p;
memMap[nptrs].file = file;
memMap[nptrs].line = line;
++nptrs;
}
if(traceFlag) {
// size_t may be wider than unsigned int, so convert explicitly
printf("Allocated %lu bytes at address %p ",
static_cast<unsigned long>(siz), p);
printf("(file: %s, line: %ld)\n", file, line);
}
return p;
}
// Overload array new
void*
operator new[](size_t siz, const char* file, long line) {
return operator new(siz, file, line);
}
// Override scalar delete
void operator delete(void* p) {
if(findPtr(p) >= 0) {
free(p);
assert(nptrs > 0);
delPtr(p);
if(traceFlag)
printf("Deleted memory at address %p\n", p);
}
else if(p && activeFlag) // Deleting a null pointer is legal
printf("Attempt to delete unknown pointer: %p\n", p);
}
// Override array delete
void operator delete[](void* p) {
operator delete(p);
} ///:~
The Boolean flags traceFlag and activeFlag are
global, so they can be modified in your code by the macros TRACE_ON( ),
TRACE_OFF( ), MEM_ON( ), and MEM_OFF( ). In
general, enclose all the code in your main( ) within a MEM_ON( )-MEM_OFF( )
pair so that memory is always tracked. Tracing, which echoes the activity of
the replacement functions for operator new( ) and operator
delete( ), is on by default, but you can turn it off with TRACE_OFF( ).
In any case, the final results are always printed (see the test runs later in this
chapter).
The MemCheck facility tracks memory by keeping all
addresses allocated by operator new( ) in an array of Info
structures, which also holds the file name and line number where the call to new
occurred. To prevent collision with any names you have placed in the global
namespace, as much information as possible is kept inside the anonymous
namespace. The Sentinel class exists solely to call a static object
destructor as the program shuts down. This destructor inspects memMap to
see if any pointers are waiting to be deleted (indicating a memory leak).
Our operator new( ) uses malloc( )
to get memory, and then adds the pointer and its associated file information to
memMap. The operator delete( ) function undoes all that work
by calling free( ) and decrementing nptrs, but first it
checks to see if the pointer in question is in the map in the first place. If
it isn’t, either you’re trying to delete an address that isn’t on the free
store, or you’re trying to delete one that’s already been deleted and removed
from the map. The activeFlag variable is important here because we don’t
want to process any deallocations from any system shutdown activity. By calling
MEM_OFF( ) at the end of your code, activeFlag will be set
to false, and such subsequent calls to delete will be ignored. (That’s
bad in a real program, but our purpose here is to find your leaks; we’re
not debugging the library.) For simplicity, we forward all work for array new
and delete to their scalar counterparts.
The following is a simple test using the MemCheck
facility:
//: C02:MemTest.cpp
//{L} MemCheck
// Test of MemCheck system.
#include <iostream>
#include <vector>
#include <cstring>
#include "MemCheck.h" // Must appear last!
using namespace std;
class Foo {
char* s;
public:
Foo(const char* s) {
this->s = new char[strlen(s) + 1];
strcpy(this->s, s);
}
~Foo() { delete [] s; }
};
int main() {
MEM_ON();
cout << "hello" << endl;
int* p = new int;
delete p;
int* q = new int[3];
delete [] q;
int* r;
delete r;
vector<int> v;
v.push_back(1);
Foo s("goodbye");
MEM_OFF();
} ///:~
This example verifies that you can use MemCheck in
the presence of streams, standard containers, and classes that allocate memory
in constructors. The pointers p and q are allocated and
deallocated without any problem, but r is not a valid heap pointer, so
the output indicates the error as an attempt to delete an unknown pointer:
hello
Allocated 4 bytes at address 0xa010778 (file: memtest.cpp, line: 25)
Deleted memory at address 0xa010778
Allocated 12 bytes at address 0xa010778 (file: memtest.cpp, line: 27)
Deleted memory at address 0xa010778
Attempt to delete unknown pointer: 0x1
Allocated 8 bytes at address 0xa0108c0 (file: memtest.cpp, line: 14)
Deleted memory at address 0xa0108c0
No user memory leaks!
Because of the call to MEM_OFF( ), no subsequent
calls to operator delete( ) by vector or ostream are
processed. You still might get some calls to delete from reallocations
performed by the containers.
If you call TRACE_OFF( ) at the beginning of the
program, the output is
hello
Attempt to delete unknown pointer: 0x1
No user memory leaks!
Much of the headache of software engineering can be avoided
by being deliberate about what you’re doing. You’ve probably been using mental
assertions as you’ve crafted your loops and functions, even if you haven’t
routinely used the assert( ) macro. If you use assert( ),
you’ll find logic errors sooner and end up with more readable code as well.
Remember to only use assertions for invariants, though, and not for runtime
error handling.
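The distinction can be sketched with a small example. The function average( ) below is hypothetical and exists only to contrast the two tools: a condition the caller can trigger at run time is reported with an exception, while a condition that can fail only through a programming error is checked with assert( ).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical example. An empty input is a runtime condition the
// caller can trigger, so it is reported with an exception.
double average(const std::vector<int>& v) {
  if(v.empty())
    throw std::invalid_argument("average of empty sequence");
  long sum = 0;
  int lo = v[0], hi = v[0];
  for(std::size_t i = 0; i < v.size(); ++i) {
    sum += v[i];
    if(v[i] < lo) lo = v[i];
    if(v[i] > hi) hi = v[i];
  }
  double avg = static_cast<double>(sum) / v.size();
  // Invariant: a mean always lies between the extremes. A failure
  // here would be a programming error, so assert() is appropriate.
  assert(lo <= avg && avg <= hi);
  return avg;
}
```

Compiling with NDEBUG removes the assertion but leaves the exception in place, which is exactly the behavior you want: the invariant check is a development aid, whereas the input check is part of the function's contract.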
Nothing will give you more peace of mind than thoroughly
tested code. If it’s been a hassle for you in the past, use an automated
framework, such as the one we’ve presented here, to integrate routine testing
into your daily work. You (and your users!) will be glad you did.
Solutions to selected exercises can be found in the electronic document The Thinking
in C++ Volume 2 Annotated Solution Guide, available for a small fee from www.MindView.net.
1. Write a test program using the TestSuite Framework for the
standard vector class that thoroughly tests the following member
functions with a vector of integers: push_back( ) (appends
an element to the end of the vector), front( ) (returns the
first element in the vector), back( ) (returns the last
element in the vector), pop_back( ) (removes the last
element without returning it), at( ) (returns the element in a
specified index position), and size( ) (returns the number of
elements). Be sure to verify that vector::at( ) throws a std::out_of_range
exception if the supplied index is out of range.
2. Suppose you are asked to develop a class named Rational
that supports rational numbers (fractions). The fraction in a Rational
object should always be stored in lowest terms, and a denominator of zero is an
error. Here is a sample interface for such a Rational class:
//: C02:Rational.h {-xo}
#ifndef RATIONAL_H
#define RATIONAL_H
#include <iosfwd>
class Rational {
public:
Rational(int numerator = 0, int denominator = 1);
Rational operator-() const;
friend Rational operator+(const Rational&,
const Rational&);
friend Rational operator-(const Rational&,
const Rational&);
friend Rational operator*(const Rational&,
const Rational&);
friend Rational operator/(const Rational&,
const Rational&);
friend std::ostream&
operator<<(std::ostream&, const Rational&);
friend std::istream&
operator>>(std::istream&, Rational&);
Rational& operator+=(const Rational&);
Rational& operator-=(const Rational&);
Rational& operator*=(const Rational&);
Rational& operator/=(const Rational&);
friend bool operator<(const Rational&,
const Rational&);
friend bool operator>(const Rational&,
const Rational&);
friend bool operator<=(const Rational&,
const Rational&);
friend bool operator>=(const Rational&,
const Rational&);
friend bool operator==(const Rational&,
const Rational&);
friend bool operator!=(const Rational&,
const Rational&);
};
#endif // RATIONAL_H ///:~
Write a complete
specification for this class, including preconditions, postconditions, and
exception specifications.
3. Write a test using the TestSuite framework that thoroughly
tests all the specifications from the previous exercise, including testing
exceptions.
4. Implement the Rational class so that all the tests from
the previous exercise pass. Use assertions only for invariants.
5. The file BuggedSearch.cpp below contains a binary search
function that searches the range [beg, end) for what. There are
some bugs in the algorithm. Use the trace techniques from this chapter to debug
the search function.
//: C02:BuggedSearch.cpp {-xo}
//{L} ../TestSuite/Test
#include <cstdlib>
#include <ctime>
#include <cassert>
#include <fstream>
#include "../TestSuite/Test.h"
using namespace std;
// This function is the only one with bugs
int* binarySearch(int* beg, int* end, int what) {
while(end - beg != 1) {
if(*beg == what) return beg;
int mid = (end - beg) / 2;
if(what <= beg[mid]) end = beg + mid;
else beg = beg + mid;
}
return 0;
}
class BinarySearchTest : public TestSuite::Test {
enum { SZ = 10 };
int* data;
int max; // Track largest number
int current; // Current non-contained number
// Used in notContained()
// Find the next number not contained in the array
int notContained() {
while(data[current] + 1 == data[current + 1])
++current;
if(current >= SZ) return max + 1;
int retValue = data[current++] + 1;
return retValue;
}
void setData() {
data = new int[SZ];
assert(!max);
// Input values with increments of one. Leave
// out some values on both odd and even indexes.
for(int i = 0; i < SZ;
rand() % 2 == 0 ? max += 1 : max += 2)
data[i++] = max;
}
void testInBound() {
// Test locations both odd and even
// not contained and contained
for(int i = SZ; --i >=0;)
test_(binarySearch(data, data + SZ, data[i]));
for(int i = notContained(); i < max;
i = notContained())
test_(!binarySearch(data, data + SZ, i));
}
void testOutBounds() {
// Test lower values
for(int i = data[0]; --i > data[0] - 100;)
test_(!binarySearch(data, data + SZ, i));
// Test higher values
for(int i = data[SZ - 1];
++i < data[SZ -1] + 100;)
test_(!binarySearch(data, data + SZ, i));
}
public:
BinarySearchTest() { max = current = 0; }
void run() {
setData();
testInBound();
testOutBounds();
delete [] data;
}
};
int main() {
srand(time(0));
BinarySearchTest t;
t.run();
return t.report();
} ///:~
Standard C++ not only incorporates all the Standard C libraries
(with small additions and changes to support type safety), it also adds
libraries of its own. These libraries are far more powerful than those in
Standard C; the leverage you get from them is analogous to the leverage you get
from changing from C to C++.
This section of the book gives you an in-depth introduction
to key portions of the Standard C++ library.
The most complete and also the most obscure reference to the
full libraries is the Standard itself. Bjarne Stroustrup’s The C++
Programming Language, Third Edition (Addison Wesley, 2000) remains a
reliable reference for both the language and the library. The most celebrated
library-only reference is The C++ Standard Library: A Tutorial and Reference,
by Nicolai Josuttis (Addison Wesley, 1999). The goal of the chapters in this
part of the book is to provide you with an encyclopedia of descriptions and
examples so that you’ll have a good starting point for solving any problem that
requires the use of the Standard libraries. However, some techniques and topics
are rarely used and are not covered here. If you can’t find it in these
chapters, reach for the other two books; this book is not intended to replace
those books but rather to complement them. In particular, we hope that after
going through the material in the following chapters you’ll have a much easier
time understanding those books.
You will notice that these chapters do not contain
exhaustive documentation describing every function and class in the Standard
C++ library. We’ve left the full descriptions to others; in particular to P.J. Plauger’s Dinkumware C/C++ Library Reference at http://www.dinkumware.com.
This is an excellent online source of standard library documentation in HTML format that you can keep resident on your computer and view with a
Web browser whenever you need to look something up. You can view this online or
purchase it for local viewing. It contains complete reference pages for
both the C and C++ libraries (so it’s good to use for all your Standard C/C++
programming questions). Electronic documentation is effective not only because
you can always have it with you, but also because you can do an electronic
search.
When you’re actively programming, these resources should
satisfy your reference needs (and you can use them to look up anything in this
chapter that isn’t clear to you). Appendix A lists additional references.
The first chapter in this section introduces the Standard
C++ string class, which is a powerful tool that simplifies most of the
text-processing chores you might have. Chances are, anything you’ve done to
character strings with lines of code in C can be done with a member function
call in the string class.
Chapter 4 covers the iostreams library, which
contains classes for processing input and output with files, string targets,
and the system console.
Although Chapter 5, “Templates in Depth,” is not explicitly
a library chapter, it is necessary preparation for the two chapters that
follow. In Chapter 6 we examine the generic algorithms offered by the Standard
C++ library. Because they are implemented with templates, these algorithms can
be applied to any sequence of objects. Chapter 7 covers the standard
containers and their associated iterators. We cover algorithms first because
they can be fully explored by using only arrays and the vector container
(which we have been using since early in Volume 1). It is also natural to use
the standard algorithms in connection with containers, so it’s good to be
familiar with the algorithms before studying the containers.
String processing with character arrays is one of the biggest
time-wasters in C. Character arrays require the programmer to keep track of the
difference between static quoted strings and arrays created on the stack and
the heap, and the fact that sometimes you’re passing around a char* and
sometimes you must copy the whole array.
Especially because string manipulation is so common,
character arrays are a great source of misunderstandings and bugs. Despite
this, creating string classes remained a common exercise for beginning C++ programmers
for many years. The Standard C++ library string class solves the problem of character array manipulation once and for all, keeping track of memory even during
assignments and copy-constructions. You simply don’t need to think about it.
This chapter examines
the Standard C++ string class, beginning with a look at what constitutes
a C++ string and how the C++ version differs from a traditional C character
array. You’ll learn about operations and manipulations using string
objects, and you’ll see how C++ strings accommodate variation in
character sets and string data conversion.
Handling text is one of the oldest programming applications,
so it’s not surprising that the C++ string draws heavily on the ideas and
terminology that have long been used in C and other languages. As you begin to
acquaint yourself with C++ strings, this fact should be reassuring. No
matter which programming idiom you choose, there are three common things you
want to do with a string:
· Create or modify the sequence of characters stored in the string.
· Detect the presence or absence of elements within the string.
· Translate between various schemes for representing string
characters.
You’ll see how each of these jobs is accomplished using C++ string
objects.
In C, a string is simply an array of characters that always
includes a binary zero (often called the null terminator) as its final
array element. There are significant differences between C++ strings and
their C progenitors. First, and most important, C++ strings hide the
physical representation of the sequence of characters they contain. You don’t need
to be concerned about array dimensions or null terminators. A string
also contains certain “housekeeping” information about the size and storage
location of its data. Specifically, a C++ string object knows its
starting location in memory, its content, its length in characters, and the
length in characters to which it can grow before the string object must
resize its internal data buffer. C++ strings thus greatly reduce the likelihood
of making three of the most common and destructive C programming errors:
overwriting array bounds, trying to access arrays through uninitialized or
incorrectly valued pointers, and leaving pointers “dangling” after an array
ceases to occupy the storage that was once allocated to it.
The exact implementation of memory layout for the string
class is not defined by the C++ Standard. This architecture is intended to be
flexible enough to allow differing implementations by compiler vendors, yet
guarantee predictable behavior for users. In particular, the exact conditions
under which storage is allocated to hold data for a string object are not
defined. String allocation rules were formulated to allow but not require a
reference-counted implementation, but whether or not the implementation uses reference counting, the semantics must be the same. To put this a bit differently,
in C, every char array occupies a unique physical region of memory. In
C++, individual string objects may or may not occupy unique physical
regions of memory, but if reference counting avoids storing duplicate copies of
data, the individual objects must look and act as though they exclusively own unique
regions of storage. For example:
//: C03:StringStorage.h
#ifndef STRINGSTORAGE_H
#define STRINGSTORAGE_H
#include <iostream>
#include <string>
#include "../TestSuite/Test.h"
using std::cout;
using std::endl;
using std::string;
class StringStorageTest : public TestSuite::Test {
public:
void run() {
string s1("12345");
// This may copy the first to the second or
// use reference counting to simulate a copy:
string s2 = s1;
test_(s1 == s2);
// Either way, this statement must ONLY modify s1:
s1[0] = '6';
cout << "s1 = " << s1
<< endl; // 62345
cout << "s2 = " << s2
<< endl; // 12345
test_(s1 != s2);
}
};
#endif // STRINGSTORAGE_H ///:~
//: C03:StringStorage.cpp
//{L} ../TestSuite/Test
#include "StringStorage.h"
int main() {
StringStorageTest t;
t.run();
return t.report();
} ///:~
We say that an implementation that only makes unique copies
when a string is modified uses a copy-on-write strategy. This approach
saves time and space when strings are used only as value parameters or in other
read-only situations.
Whether a library implementation uses reference counting or
not should be transparent to users of the string class. Unfortunately,
this is not always the case. In multithreaded programs, it is practically
impossible to use a reference-counting implementation safely.
Creating and initializing strings is a straightforward
proposition and fairly flexible. In the SmallString.cpp example below,
the first string, imBlank, is declared but contains no initial
value. Unlike a C char array, which would contain a random and
meaningless bit pattern until initialization, imBlank does contain
meaningful information. This string object is initialized to hold “no
characters” and can properly report its zero length and absence of data
elements using class member functions.
The next string, heyMom, is initialized by the
literal argument “Where are my socks?” This form of initialization uses a
quoted character array as a parameter to the string constructor. By
contrast, standardReply is simply initialized with an assignment. The
last string of the group, useThisOneAgain, is initialized using an
existing C++ string object. Put another way, this example illustrates
that string objects let you do the following:
· Create an empty string and defer initializing it with
character data.
· Initialize a string by passing a literal, quoted character
array as an argument to the constructor.
· Initialize a string using the equal sign (=).
· Use one string to initialize another.
//: C03:SmallString.cpp
#include <string>
using namespace std;
int main() {
string imBlank;
string heyMom("Where are my socks?");
string standardReply = "Beamed into deep "
"space on wide angle dispersion?";
string useThisOneAgain(standardReply);
} ///:~
These are the simplest forms of string
initialization, but variations offer more flexibility and control. You can do
the following:
· Use a portion of either a C char array or a C++ string.
· Combine different sources of initialization data using operator+.
· Use the string object’s substr( ) member function to create a substring.
Here’s a program that illustrates
these features:
//: C03:SmallString2.cpp
#include <string>
#include <iostream>
using namespace std;
int main() {
string s1("What is the sound of one clam napping?");
string s2("Anything worth doing is worth overdoing.");
string s3("I saw Elvis in a UFO");
// Copy the first 8 chars:
string s4(s1, 0, 8);
cout << s4 << endl;
// Copy 6 chars from the middle of the source:
string s5(s2, 15, 6);
cout << s5 << endl;
// Copy from middle to end:
string s6(s3, 6, 15);
cout << s6 << endl;
// Copy many different things:
string quoteMe = s4 + "that" +
// substr() copies 10 chars at element 20
s1.substr(20, 10) + s5 +
// substr() copies up to either 100 char
// or eos starting at element 5
"with" + s3.substr(5, 100) +
// OK to copy a single char this way
s1.substr(37, 1);
cout << quoteMe << endl;
} ///:~
The string member function substr( )
takes a starting position as its first argument and the number of characters to
select as the second argument. Both arguments have default values. If you say substr( )
with an empty argument list, you produce a copy of the entire string, so
this is a convenient way to duplicate a string.
Here’s the output from the program:
What is
doing
Elvis in a UFO
What is that one clam doing with Elvis in a UFO?
Notice the final line of the example. C++ allows string
initialization techniques to be mixed in a single statement, a flexible and
convenient feature. Also notice that the last initializer copies just one
character from the source string.
Another slightly more subtle initialization technique
involves the use of the string iterators string::begin( )
and string::end( ). This technique treats a string like a container
object (which you’ve seen primarily in the form of vector so far—you’ll
see many more containers in Chapter 7), which uses iterators to indicate
the start and end of a sequence of characters. In this way you can hand a string
constructor two iterators, and it copies from one to the other into the new string:
//: C03:StringIterators.cpp
#include <string>
#include <iostream>
#include <cassert>
using namespace std;
int main() {
string source("xxx");
string s(source.begin(), source.end());
assert(s == source);
} ///:~
The iterators are not restricted to begin( ) and
end( ); you can increment, decrement, and add integer offsets to
them, allowing you to extract a subset of characters from the source string.
C++ strings may not be initialized with single
characters or with ASCII or other integer values. You can initialize a string
with a number of copies of a single character, however:
//: C03:UhOh.cpp
#include <string>
#include <cassert>
using namespace std;
int main() {
// Error: no single char inits
//! string nothingDoing1('a');
// Error: no integer inits
//! string nothingDoing2(0x37);
// The following is legal:
string okay(5, 'a');
assert(okay == string("aaaaa"));
} ///:~
The first argument indicates the number of copies of the
second argument to place in the string. The second argument can only be a
single char, not a char array.
If you’ve programmed in C, you are accustomed to the family
of functions that write, search, modify, and copy char arrays. There are
two unfortunate aspects of the Standard C library functions for handling char
arrays. First, there are two loosely organized families of them: the “plain”
group, and the ones that require you to supply a count of the number of
characters to be considered in the operation at hand. The roster of functions
in the C char array library shocks the unsuspecting user with a long
list of cryptic, mostly unpronounceable names. Although the type and number of
arguments to the functions are somewhat consistent, to use them properly you
must be attentive to details of function naming and parameter passing.
The second inherent trap of the standard C char array
tools is that they all rely explicitly on the assumption that the character
array includes a null terminator. If by oversight or error the null is omitted
or overwritten, there’s little to keep the C char array functions from
manipulating the memory beyond the limits of the allocated space, sometimes
with disastrous results.
C++ provides a vast improvement in the convenience and
safety of string objects. For purposes of actual string handling
operations, there are about the same number of distinct member function names
in the string class as there are functions in the C library, but because
of overloading the functionality is much greater. Coupled with sensible naming
practices and the judicious use of default arguments, these features combine to
make the string class much easier to use than the C library char
array functions.
One of the most valuable and convenient aspects of C++ strings
is that they grow as needed, without intervention on the part of the
programmer. Not only does this make string-handling code inherently more
trustworthy, it also almost entirely eliminates a tedious “housekeeping”
chore—keeping track of the bounds of the storage where your strings live. For
example, if you create a string object and initialize it with a string of 50
copies of ‘X’, and later store in it 50 copies of “Zowie”, the object itself
will reallocate sufficient storage to accommodate the growth of the data.
Perhaps nowhere is this property more appreciated than when the strings
manipulated in your code change size and you don’t know how big the change is. The
string member functions append( ) and insert( )
transparently reallocate storage when a string grows:
//: C03:StrSize.cpp
#include <string>
#include <iostream>
using namespace std;
int main() {
string bigNews("I saw Elvis in a UFO. ");
cout << bigNews << endl;
// How much data have we actually got?
cout << "Size = " <<
bigNews.size() << endl;
// How much can we store without reallocating?
cout << "Capacity = " <<
bigNews.capacity() << endl;
// Insert this string in bigNews immediately
// before bigNews[1]:
bigNews.insert(1, " thought I");
cout << bigNews << endl;
cout << "Size = " <<
bigNews.size() << endl;
cout << "Capacity = " <<
bigNews.capacity() << endl;
// Make sure that there will be this much space
bigNews.reserve(500);
// Add this to the end of the string:
bigNews.append("I've been working too hard.");
cout << bigNews << endl;
cout << "Size = " <<
bigNews.size() << endl;
cout << "Capacity = " <<
bigNews.capacity() << endl;
} ///:~
Here is the output from one particular compiler:
I saw Elvis in a UFO.
Size = 22
Capacity = 31
I thought I saw Elvis in a UFO.
Size = 32
Capacity = 47
I thought I saw Elvis in a UFO. I've been working too hard.
Size = 59
Capacity = 511
This example demonstrates that even though you can safely
relinquish much of the responsibility for allocating and managing the memory
your strings occupy, C++ strings provide you with several tools
to monitor and manage their size. Notice the ease with which we changed the
size of the storage allocated to the string. The size( ) function returns the number of characters currently stored in the string and is identical to the length( ) member function. The capacity( ) function returns
the size of the current underlying allocation, meaning the number of characters
the string can hold without requesting more storage. The reserve( )
function is an optimization mechanism that indicates your intention to specify
a certain amount of storage for future use; capacity( ) always
returns a value at least as large as the most recent call to reserve( ).
A resize( ) function appends null characters (the value char( )) if the new size is greater than
the current string size or truncates the string otherwise. (An overload of resize( )
can specify a different character to append.)
The exact fashion that the string member functions
allocate space for your data depends on the implementation of the library. When
we tested one implementation with the previous example, it appeared that
reallocations occurred on even word (that is, full-integer) boundaries, with
one byte held back. The architects of the string class have endeavored
to make it possible to mix the use of C char arrays and C++ string
objects, so it is likely that figures reported by StrSize.cpp for capacity
reflect that, in this particular implementation, a byte is set aside to easily
accommodate the insertion of a null terminator.
The insert( ) function is particularly
nice because it absolves you from making sure the insertion of characters in a
string won’t overrun the storage space or overwrite the characters immediately
following the insertion point. Space grows, and existing characters politely
move over to accommodate the new elements. Sometimes this might not be what you
want. If you want the size of the string to remain unchanged, use the replace( ) function to overwrite characters. There are a number of
overloaded versions of replace( ), but the simplest one takes three
arguments: an integer indicating where to start in the string, an integer
indicating how many characters to eliminate from the original string, and the
replacement string (which can be a different number of characters than the
eliminated quantity). Here’s a simple example:
//: C03:StringReplace.cpp
// Simple find-and-replace in strings.
#include <cassert>
#include <string>
using namespace std;
int main() {
string s("A piece of text");
string tag("$tag$");
s.insert(8, tag + ' ');
assert(s == "A piece $tag$ of text");
int start = s.find(tag);
assert(start == 8);
assert(tag.size() == 5);
s.replace(start, tag.size(), "hello there");
assert(s == "A piece hello there of text");
} ///:~
The tag is first inserted into s (notice that
the insert happens before the value indicating the insert point and that
an extra space was added after tag), and then it is found and replaced.
You should check to see if you’ve found anything before you
perform a replace( ). The previous example replaces with a char*,
but there’s an overloaded version that replaces with a string. Here’s
a more complete demonstration of replace( ):
//: C03:Replace.cpp
#include <cassert>
#include <cstddef> // For size_t
#include <string>
using namespace std;
void replaceChars(string& modifyMe,
const string& findMe, const string& newChars)
{
// Look in modifyMe for the "find string"
// starting at position 0:
size_t i = modifyMe.find(findMe, 0);
// Did we find the string to replace?
if(i != string::npos)
// Replace the find string with newChars:
modifyMe.replace(i, findMe.size(), newChars);
}
int main() {
string bigNews = "I thought I saw Elvis in a UFO. "
"I have been working too hard.";
string replacement("wig");
string findMe("UFO");
// Find "UFO" in bigNews and overwrite it:
replaceChars(bigNews, findMe, replacement);
assert(bigNews == "I thought I saw Elvis in a "
"wig. I have been working too hard.");
} ///:~
If find( ) doesn’t find the search string, it returns
string::npos. The npos data member is a static constant member of
the string class that represents a nonexistent character position.
Unlike insert( ), replace( ) won’t
grow the string’s storage space if you copy new characters into the
middle of an existing series of array elements. However, it will grow the storage space if needed, for example, when you make a “replacement” that would
expand the original string beyond the end of the current allocation. Here’s an
example:
//: C03:ReplaceAndGrow.cpp
#include <cassert>
#include <string>
using namespace std;
int main() {
string bigNews("I have been working the grave.");
string replacement("yard shift.");
// The first argument says "start replacing at the
// last character of the existing string":
bigNews.replace(bigNews.size() - 1,
replacement.size(), replacement);
assert(bigNews == "I have been working the "
"graveyard shift.");
} ///:~
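The call to replace( ) begins at the last character of the
existing string (the period) and asks to replace more characters than remain.
Only that final character is removed, so the rest of the replacement is in
effect appended. Notice that in this example replace( ) expands the
string accordingly.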
You may have been hunting through this chapter trying to do
something relatively simple such as replace all the instances of one character
with a different character. Upon finding the previous material on replacing,
you thought you found the answer, but then you started seeing groups of
characters and counts and other things that looked a bit too complex. Doesn’t string
have a way to just replace one character with another everywhere?
You can easily write such a function using the find( )
and replace( ) member functions as follows:
//: C03:ReplaceAll.h
#ifndef REPLACEALL_H
#define REPLACEALL_H
#include <string>
std::string& replaceAll(std::string& context,
const std::string& from, const std::string& to);
#endif // REPLACEALL_H ///:~
//: C03:ReplaceAll.cpp {O}
#include <cstddef>
#include "ReplaceAll.h"
using namespace std;
string& replaceAll(string& context,
const string& from, const string& to) {
size_t lookHere = 0;
size_t foundHere;
while((foundHere = context.find(from, lookHere))
!= string::npos) {
context.replace(foundHere, from.size(), to);
lookHere = foundHere + to.size();
}
return context;
} ///:~
The version of find( ) used here takes as a
second argument the position to start looking in and returns string::npos
if it doesn’t find it. It is important to advance the position held in the
variable lookHere past the replacement string, in case from is a
substring of to. The following program tests the replaceAll
function:
//: C03:ReplaceAllTest.cpp
//{L} ReplaceAll
#include <cassert>
#include <iostream>
#include <string>
#include "ReplaceAll.h"
using namespace std;
int main() {
string text = "a man, a plan, a canal, Panama";
replaceAll(text, "an", "XXX");
assert(text == "a mXXX, a plXXX, a cXXXal, PXXXama");
} ///:~
As you can see, the string class by itself doesn’t
solve all possible problems. Many solutions have been left to the algorithms in
the Standard library because
the string class can look just like an STL sequence (by virtue of the
iterators discussed earlier). All the generic algorithms work on a “range” of
elements within a container. Usually that range is just “from the beginning of
the container to the end.” A string object looks like a container of
characters: to get the beginning of the range you use string::begin( ),
and to get the end of the range you use string::end( ). The
following example shows the use of the replace( ) algorithm to
replace all the instances of the single character ‘X’ with ‘Y’:
//: C03:StringCharReplace.cpp
#include <algorithm>
#include <cassert>
#include <string>
using namespace std;
int main() {
string s("aaaXaaaXXaaXXXaXXXXaaa");
replace(s.begin(), s.end(), 'X', 'Y');
assert(s == "aaaYaaaYYaaYYYaYYYYaaa");
} ///:~
Notice that this replace( ) is not called
as a member function of string. Also, unlike the string::replace( )
functions that only perform one replacement, the replace( )
algorithm replaces all instances of one character with another.
The replace( ) algorithm only works with single
objects (in this case, char objects) and will not replace quoted char
arrays or string objects. Since a string behaves like an STL
sequence, a number of other algorithms can be applied to it, which might solve
other problems that are not directly addressed by the string member
functions.
One of the most delightful discoveries awaiting a C
programmer learning about C++ string handling is how simply strings
can be combined and appended using operator+ and operator+=. These
operators make combining strings syntactically similar to adding numeric
data:
//: C03:AddStrings.cpp
#include <string>
#include <cassert>
using namespace std;
int main() {
string s1("This ");
string s2("That ");
string s3("The other ");
// operator+ concatenates strings
s1 = s1 + s2;
assert(s1 == "This That ");
// Another way to concatenate strings
s1 += s3;
assert(s1 == "This That The other ");
// You can index the string on the right
s1 += s3 + s3[4] + "ooh lala";
assert(s1 == "This That The other The other oooh lala");
} ///:~
Using the operator+ and operator+= operators
is a flexible and convenient way to combine string data. On
the right side of the statement, you can use almost any type that evaluates to
a group of one or more characters.
The find family of string member functions
locates a character or group of characters within a given string. Here are the
members of the find family and their general usage:
| string find member function | What/how it finds |
| find( ) | Searches a string for a specified character or group of characters and returns the starting position of the first occurrence found or npos if no match is found. |
| find_first_of( ) | Searches a target string and returns the position of the first match of any character in a specified group. If no match is found, it returns npos. |
| find_last_of( ) | Searches a target string and returns the position of the last match of any character in a specified group. If no match is found, it returns npos. |
| find_first_not_of( ) | Searches a target string and returns the position of the first element that doesn’t match any character in a specified group. If no such element is found, it returns npos. |
| find_last_not_of( ) | Searches a target string and returns the position of the element with the largest subscript that doesn’t match any character in a specified group. If no such element is found, it returns npos. |
| rfind( ) | Searches a string from end to beginning for a specified character or group of characters and returns the starting position of the match if one is found. If no match is found, it returns npos. |
The simplest use of find( )
searches for one or more characters in a string. This overloaded
version of find( ) takes a parameter that specifies the
character(s) for which to search and optionally a parameter that tells it where
in the string to begin searching for the occurrence of a substring. (The default
position at which to begin searching is 0.) By setting the call to find inside
a loop, you can easily move through a string, repeating a search to find all
the occurrences of a given character or group of characters within the string.
The following program uses the method of The Sieve of
Eratosthenes to find prime numbers less than 50. This method starts with
the number 2, marks all subsequent multiples of 2 as not prime, and repeats the
process for the next prime candidate. The SieveTest constructor
initializes sieveChars by setting the initial size of the character
array and writing the value ‘P’ to each of its members.
//: C03:Sieve.h
#ifndef SIEVE_H
#define SIEVE_H
#include <cmath>
#include <cstddef>
#include <string>
#include "../TestSuite/Test.h"
using std::size_t;
using std::sqrt;
using std::string;
class SieveTest : public TestSuite::Test {
string sieveChars;
public:
// Create a 50 char string and set each
// element to 'P' for Prime:
SieveTest() : sieveChars(50, 'P') {}
void run() {
findPrimes();
testPrimes();
}
bool isPrime(int p) {
if(p == 0 || p == 1) return false;
int root = int(sqrt(double(p)));
for(int i = 2; i <= root; ++i)
if(p % i == 0) return false;
return true;
}
void findPrimes() {
// By definition neither 0 nor 1 is prime.
// Change these elements to "N" for Not Prime:
sieveChars.replace(0, 2, "NN");
// Walk through the array:
size_t sieveSize = sieveChars.size();
int root = int(sqrt(double(sieveSize)));
for(int i = 2; i <= root; ++i)
// Find all the multiples:
for(size_t factor = 2; factor * i < sieveSize;
++factor)
sieveChars[factor * i] = 'N';
}
void testPrimes() {
size_t i = sieveChars.find('P');
while(i != string::npos) {
test_(isPrime(i++));
i = sieveChars.find('P', i);
}
i = sieveChars.find_first_not_of('P');
while(i != string::npos) {
test_(!isPrime(i++));
i = sieveChars.find_first_not_of('P', i);
}
}
};
#endif // SIEVE_H ///:~
//: C03:Sieve.cpp
//{L} ../TestSuite/Test
#include "Sieve.h"
int main() {
SieveTest t;
t.run();
return t.report();
} ///:~
The find( ) function can walk forward through a string,
detecting multiple occurrences of a character or a group of characters, and find_first_not_of( )
finds the first character that is not in a given set (here, anything other than ‘P’).
There are no functions in the string class to change
the case of a string, but you can easily create these functions using the
Standard C library functions toupper( ) and tolower( ),
which change the case of one character at a time. The following example
illustrates a case-insensitive search:
//: C03:Find.h
#ifndef FIND_H
#define FIND_H
#include <cctype>
#include <cstddef>
#include <string>
#include "../TestSuite/Test.h"
using std::size_t;
using std::string;
using std::tolower;
using std::toupper;
// Make an uppercase copy of s
inline string upperCase(const string& s) {
string upper(s);
for(size_t i = 0; i < s.length(); ++i)
upper[i] = toupper(upper[i]);
return upper;
}
// Make a lowercase copy of s
inline string lowerCase(const string& s) {
string lower(s);
for(size_t i = 0; i < s.length(); ++i)
lower[i] = tolower(lower[i]);
return lower;
}
class FindTest : public TestSuite::Test {
string chooseOne;
public:
FindTest() : chooseOne("Eenie, Meenie, Miney, Mo") {}
void testUpper() {
string upper = upperCase(chooseOne);
const string LOWER =
"abcdefghijklmnopqrstuvwxyz";
test_(upper.find_first_of(LOWER) == string::npos);
}
void testLower() {
string lower = lowerCase(chooseOne);
const string UPPER =
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
test_(lower.find_first_of(UPPER) == string::npos);
}
void testSearch() {
// Case sensitive search
size_t i = chooseOne.find("een");
test_(i == 8);
// Search lowercase:
string test = lowerCase(chooseOne);
i = test.find("een");
test_(i == 0);
i = test.find("een", ++i);
test_(i == 8);
i = test.find("een", ++i);
test_(i == string::npos);
// Search uppercase:
test = upperCase(chooseOne);
i = test.find("EEN");
test_(i == 0);
i = test.find("EEN", ++i);
test_(i == 8);
i = test.find("EEN", ++i);
test_(i == string::npos);
}
void run() {
testUpper();
testLower();
testSearch();
}
};
#endif // FIND_H ///:~
//: C03:Find.cpp
//{L} ../TestSuite/Test
#include "Find.h"
#include "../TestSuite/Test.h"
int main() {
FindTest t;
t.run();
return t.report();
} ///:~
Both the upperCase( ) and lowerCase( )
functions follow the same form: they make a copy of the argument string
and change the case. The Find.cpp program isn’t the best solution to the
case-sensitivity problem, so we’ll revisit it when we examine string
comparisons.
If you need to search through a string from end to
beginning (to find the data in “last in / first out” order), you can use the
string member function rfind( ):
//: C03:Rparse.h
#ifndef RPARSE_H
#define RPARSE_H
#include <cstddef>
#include <string>
#include <vector>
#include "../TestSuite/Test.h"
using std::size_t;
using std::string;
using std::vector;
class RparseTest : public TestSuite::Test {
// To store the words:
vector<string> strings;
public:
void parseForData() {
// The ';' characters will be delimiters
string s("now.;sense;make;to;going;is;This");
// The last element of the string:
int last = s.size();
// The beginning of the current word:
size_t current = s.rfind(';');
// Walk backward through the string:
while(current != string::npos) {
// Push each word into the vector.
// Current is incremented before copying
// to avoid copying the delimiter:
++current;
strings.push_back(s.substr(current, last - current));
// Back over the delimiter we just found,
// and set last to the end of the next word:
current -= 2;
last = current + 1;
// Find the next delimiter:
current = s.rfind(';', current);
}
// Pick up the first word -- it's not
// preceded by a delimiter:
strings.push_back(s.substr(0, last));
}
void testData() {
// Test them in the new order:
test_(strings[0] == "This");
test_(strings[1] == "is");
test_(strings[2] == "going");
test_(strings[3] == "to");
test_(strings[4] == "make");
test_(strings[5] == "sense");
test_(strings[6] == "now.");
string sentence;
for(size_t i = 0; i < strings.size() - 1; i++)
sentence += strings[i] += " ";
// Manually put last word in to avoid an extra space:
sentence += strings[strings.size() - 1];
test_(sentence == "This is going to make sense now.");
}
void run() {
parseForData();
testData();
}
};
#endif // RPARSE_H ///:~
//: C03:Rparse.cpp
//{L} ../TestSuite/Test
#include "Rparse.h"
int main() {
RparseTest t;
t.run();
return t.report();
} ///:~
The string member function rfind( ) backs
through the string looking for tokens and reports the array index of matching
characters or string::npos if it is unsuccessful.
The find_first_not_of( ) and find_last_not_of( )
member functions can be conveniently put to work to create a little utility
that will strip whitespace characters from both ends of a string. Notice that
it doesn’t touch the original string, but instead returns a new string:
//: C03:Trim.h
// General tool to strip spaces from both ends.
#ifndef TRIM_H
#define TRIM_H
#include <string>
#include <cstddef>
inline std::string trim(const std::string& s) {
if(s.length() == 0)
return s;
std::size_t beg = s.find_first_not_of(" \a\b\f\n\r\t\v");
std::size_t end = s.find_last_not_of(" \a\b\f\n\r\t\v");
if(beg == std::string::npos) // No non-spaces
return "";
return std::string(s, beg, end - beg + 1);
}
#endif // TRIM_H ///:~
The first test checks for an empty string; in that
case, no tests are made, and a copy is returned. Notice that once the end
points are found, the string constructor builds a new string from
the old one, giving the starting count and the length.
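To see that three-argument constructor in isolation, here is a minimal sketch. The helper name middle and the sample text are assumptions made purely for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy the characters from index beg
// through index end (inclusive) into a new string.
std::string middle(const std::string& s,
                   std::size_t beg, std::size_t end) {
  // Arguments: source, starting index, number of characters
  return std::string(s, beg, end - beg + 1);
}

void checkMiddle() {
  std::string padded("  hello  ");
  std::size_t beg = padded.find_first_not_of(' ');
  std::size_t end = padded.find_last_not_of(' ');
  assert(beg == 2 && end == 6);
  assert(middle(padded, beg, end) == "hello");
  // The source string is untouched:
  assert(padded == "  hello  ");
}
```

The length is end - beg + 1 rather than end - beg because both indexes refer to characters that must be kept.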
Testing such a general-purpose tool needs to be thorough:
//: C03:TrimTest.h
#ifndef TRIMTEST_H
#define TRIMTEST_H
#include "Trim.h"
#include "../TestSuite/Test.h"
class TrimTest : public TestSuite::Test {
enum {NTESTS = 11};
static std::string s[NTESTS];
public:
void testTrim() {
test_(trim(s[0]) == "abcdefghijklmnop");
test_(trim(s[1]) == "abcdefghijklmnop");
test_(trim(s[2]) == "abcdefghijklmnop");
test_(trim(s[3]) == "a");
test_(trim(s[4]) == "ab");
test_(trim(s[5]) == "abc");
test_(trim(s[6]) == "a b c");
test_(trim(s[7]) == "a b c");
test_(trim(s[8]) == "a \t b \t c");
test_(trim(s[9]) == "");
test_(trim(s[10]) == "");
}
void run() {
testTrim();
}
};
#endif // TRIMTEST_H ///:~
//: C03:TrimTest.cpp {O}
#include "TrimTest.h"
// Initialize static data
std::string TrimTest::s[TrimTest::NTESTS] = {
" \t abcdefghijklmnop \t ",
"abcdefghijklmnop \t ",
" \t abcdefghijklmnop",
"a", "ab", "abc",
"a b c",
" \t a b c \t ",
" \t a \t b \t c \t ",
"\t \n \r \v \f",
"" // Must also test the empty string
}; ///:~
//: C03:TrimTestMain.cpp
//{L} ../TestSuite/Test TrimTest
#include "TrimTest.h"
int main() {
TrimTest t;
t.run();
return t.report();
} ///:~
In the array of strings, you can see that the
character arrays are automatically converted to string objects. This
array provides cases to check the removal of spaces and tabs from both ends, as
well as ensuring that spaces and tabs are not removed from the middle of a string.
Removing characters is easy and efficient with the erase( ) member function, which takes two arguments: where to start
removing characters (which defaults to 0), and how many to remove (which
defaults to string::npos). If you specify more characters than remain in
the string, the remaining characters are all erased anyway (so calling erase( )
without any arguments removes all characters from a string). Sometimes it’s
useful to take an HTML file and strip its tags and special characters so that
you have something approximating the text that would be displayed in the Web
browser, only as a plain text file. The following example uses erase( )
to do the job:
//: C03:HTMLStripper.cpp {RunByHand}
//{L} ReplaceAll
// Filter to remove html tags and markers.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include "ReplaceAll.h"
#include "../require.h"
using namespace std;
string& stripHTMLTags(string& s) {
static bool inTag = false;
bool done = false;
while(!done) {
if(inTag) {
// The previous line started an HTML tag
// but didn't finish. Must search for '>'.
size_t rightPos = s.find('>');
if(rightPos != string::npos) {
inTag = false;
s.erase(0, rightPos + 1);
}
else {
done = true;
s.erase();
}
}
else {
// Look for start of tag:
size_t leftPos = s.find('<');
if(leftPos != string::npos) {
// See if tag close is in this line:
size_t rightPos = s.find('>');
if(rightPos == string::npos) {
inTag = done = true;
s.erase(leftPos);
}
else
s.erase(leftPos, rightPos - leftPos + 1);
}
else
done = true;
}
}
// Remove all special HTML characters
replaceAll(s, "&lt;", "<");
replaceAll(s, "&gt;", ">");
replaceAll(s, "&amp;", "&");
replaceAll(s, "&nbsp;", " ");
// Etc...
return s;
}
int main(int argc, char* argv[]) {
requireArgs(argc, 1,
"usage: HTMLStripper InputFile");
ifstream in(argv[1]);
assure(in, argv[1]);
string s;
while(getline(in, s))
if(!stripHTMLTags(s).empty())
cout << s << endl;
} ///:~
This example will even strip HTML tags that span multiple
lines. This is accomplished with the static flag, inTag, which is true whenever
the start of a tag is found, but the accompanying tag end is not found in the
same line. All forms of erase( ) appear in the stripHTMLTags( )
function. The version of getline( ) we use here is a (global) function declared
in the <string> header and is handy because it stores an
arbitrarily long line in its string argument. You don’t need to worry
about the dimension of a character array as you do with istream::getline( ).
Notice that this program uses the replaceAll( ) function from
earlier in this chapter. In the next chapter, we’ll use string streams to
create a more elegant solution.
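The forms of erase( ) used in stripHTMLTags( ) are easier to tell apart when each is exercised on its own. The sketch below is not part of the original program, and the sample text is made up:

```cpp
#include <cassert>
#include <string>

void checkErase() {
  std::string s("<b>bold</b> text");
  // Start position and count: remove the 3-character "<b>"
  s.erase(0, 3);
  assert(s == "bold</b> text");
  // Start position only: remove from index 4 to the end
  s.erase(4);
  assert(s == "bold");
  // A count larger than what remains is not an error:
  s.erase(2, 100);
  assert(s == "bo");
  // No arguments: remove everything
  s.erase();
  assert(s.empty());
}
```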
Comparing strings is inherently different from comparing
numbers. Numbers have constant, universally meaningful values. To evaluate the
relationship between the magnitudes of two strings, you must make a lexical
comparison. Lexical comparison means that when you test a character to see
if it is “greater than” or “less than” another character, you are actually
comparing the numeric representation of those characters as specified in the
collating sequence of the character set being used. Most often this will be the
ASCII collating sequence, which assigns the printable characters for the
English language numbers in the range 32 through 126 decimal. In the ASCII
collating sequence, the first “character” in the list is the space, followed by
several common punctuation marks, the digits, and then uppercase and lowercase
letters. With respect to the alphabet, this means that the letters nearer the
front have lower ASCII values than those nearer the end, and that every
uppercase letter has a lower value than every lowercase letter. With these
details in mind, it becomes easier to remember that when a lexical comparison
reports that s1 is “greater than” s2, it simply means that when the two were
compared, the first differing character in s1 came later in the collating
sequence than the character in that same position in s2.
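A few assertions make these rules concrete. This is only a sketch, and it assumes the execution character set is ASCII (or a superset of it), which holds on most platforms but is not guaranteed by the language:

```cpp
#include <cassert>
#include <string>

void checkLexicalOrder() {
  // Characters compare by their numeric codes. In ASCII,
  // space < digits < uppercase < lowercase:
  assert(' ' < '0');
  assert('9' < 'A');
  assert('Z' < 'a');
  // Strings are ordered by the first differing character:
  // "This" and "That" differ at index 2, and 'i' > 'a'.
  assert(std::string("This") > std::string("That"));
  // Every uppercase letter precedes every lowercase one,
  // so "Zebra" sorts before "apple":
  assert(std::string("Zebra") < std::string("apple"));
  // If one string is a prefix of the other, the shorter
  // string is the lesser:
  assert(std::string("The") < std::string("Then"));
}
```

The "Zebra" case is the one that most often surprises people who expect dictionary order.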
C++ provides several ways to compare strings, and each has
advantages. The simplest to use are the nonmember, overloaded operator
functions: operator ==, operator !=, operator >, operator
<, operator >=, and operator <=.
//: C03:CompStr.h
#ifndef COMPSTR_H
#define COMPSTR_H
#include <string>
#include "../TestSuite/Test.h"
using std::string;
class CompStrTest : public TestSuite::Test {
public:
void run() {
// Strings to compare
string s1("This");
string s2("That");
test_(s1 == s1);
test_(s1 != s2);
test_(s1 > s2);
test_(s1 >= s2);
test_(s1 >= s1);
test_(s2 < s1);
test_(s2 <= s1);
test_(s1 <= s1);
}
};
#endif // COMPSTR_H ///:~
//: C03:CompStr.cpp
//{L} ../TestSuite/Test
#include "CompStr.h"
int main() {
CompStrTest t;
t.run();
return t.report();
} ///:~
The overloaded comparison operators are useful for comparing
both full strings and individual string character elements.
Notice in the following example the flexibility of argument
types on both the left and right side of the comparison operators. For
efficiency, the string class provides overloaded operators for the
direct comparison of string objects, quoted literals, and pointers to C-style
strings without having to create temporary string objects.
//: C03:Equivalence.cpp
#include <iostream>
#include <string>
using namespace std;
int main() {
string s2("That"), s1("This");
// The left operand is a quoted literal
// and the right operand is a string:
if("That" == s2)
cout << "A match" << endl;
// The left operand is a string and the right is
// a pointer to a C-style null terminated string:
if(s1 != s2.c_str())
cout << "No match" << endl;
} ///:~
The c_str( ) function returns a const char*
that points to a C-style, null-terminated string equivalent to the contents of
the string object. This comes in handy when you want to pass a string to
a standard C function, such as atoi( ) or any of the functions
defined in the <cstring> header. It is an error to use the value
returned by c_str( ) as a non-const argument to any function.
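Here is a minimal sketch of handing a string to standard C functions through c_str( ). The helper name toInt is hypothetical and exists only for this illustration:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: convert a string of digits to an int
// by handing its C-style form to atoi().
int toInt(const std::string& digits) {
  return std::atoi(digits.c_str());
}

void checkCStr() {
  std::string s("1234");
  assert(toInt(s) == 1234);
  // strlen() stops at the null terminator that c_str()
  // guarantees to be there:
  assert(std::strlen(s.c_str()) == s.size());
  // The pointer is to const char; treat it as read-only.
  const char* p = s.c_str();
  assert(p[0] == '1');
}
```

Keep in mind that the pointer is only good while the string is left unmodified; a later change to the string may invalidate it.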
You won’t find the logical not (!) or the logical
comparison operators (&& and ||) among operators for a
string. (Neither will you find overloaded versions of the bitwise C operators &,
|, ^, or ~.) The overloaded nonmember comparison operators
for the string class are limited to the subset that has clear, unambiguous
application to single characters or groups of characters.
The compare( ) member function offers you a
great deal more sophisticated and precise comparison than the nonmember
operator set. It provides overloaded versions to compare:
· Two complete strings.
· Part of either string to a complete string.
· Subsets of two strings.
The following example compares complete strings:
//: C03:Compare.cpp
// Demonstrates compare() and swap().
#include <cassert>
#include <string>
using namespace std;
int main() {
string first("This");
string second("That");
assert(first.compare(first) == 0);
assert(second.compare(second) == 0);
// Which is lexically greater?
assert(first.compare(second) > 0);
assert(second.compare(first) < 0);
first.swap(second);
assert(first.compare(second) < 0);
assert(second.compare(first) > 0);
} ///:~
The swap( ) function in this example does what
its name implies: it exchanges the contents of its object and argument. To
compare a subset of the characters in one or both strings, you add arguments
that define where to start the comparison and how many characters to consider.
For example, we can use the following overloaded version of compare( ):
s1.compare(s1StartPos, s1NumberChars, s2, s2StartPos,
s2NumberChars);
Here’s an example:
//: C03:Compare2.cpp
// Illustrate overloaded compare().
#include <cassert>
#include <string>
using namespace std;
int main() {
string first("This is a day that will live in infamy");
string second("I don't believe that this is what "
"I signed up for");
// Compare "his is" in both strings:
assert(first.compare(1, 7, second, 22, 7) == 0);
// Compare "his is a" to "his is w":
assert(first.compare(1, 9, second, 22, 9) < 0);
} ///:~
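The remaining case from the list above, comparing part of one string to a complete string, uses a three-argument overload: a starting position, a character count, and the other string. A minimal sketch, with sample text of our own choosing:

```cpp
#include <cassert>
#include <string>

void checkPartialCompare() {
  std::string sentence("This is a day that "
                       "will live in infamy");
  std::string word("day");
  // Compare 3 characters of sentence, starting at
  // index 10, against all of word:
  assert(sentence.compare(10, 3, word) == 0);
  // "Thi" against "day": 'T' precedes 'd' in ASCII
  assert(sentence.compare(0, 3, word) < 0);
  // "tha" against "day": 't' follows 'd'
  assert(sentence.compare(14, 3, word) > 0);
}
```

The two inequality checks assume an ASCII-compatible character set, as does the rest of this discussion.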
In the examples so far, we have used C-style array indexing
syntax to refer to an individual character in a string. C++ strings provide an
alternative to the s[n] notation: the at( ) member. These two indexing mechanisms produce the same result in C++ if all goes well:
//: C03:StringIndexing.cpp
#include <cassert>
#include <string>
using namespace std;
int main() {
string s("1234");
assert(s[1] == '2');
assert(s.at(1) == '2');
} ///:~
There is one important difference, however, between [ ]
and at( ). When you try to reference an array element that is out
of bounds, at( ) will do you the kindness of throwing an exception,
while ordinary [ ] subscripting syntax will leave you to your own
devices:
//: C03:BadStringIndexing.cpp
#include <exception>
#include <iostream>
#include <string>
using namespace std;
int main() {
string s("1234");
// at() saves you by throwing an exception:
try {
s.at(5);
} catch(exception& e) {
cerr << e.what() << endl;
}
} ///:~
Responsible programmers will not use errant indexes, but
should you want the benefits of automatic index checking, using at( ) in
place of [ ] will give you a chance to gracefully recover from
references to array elements that don’t exist. Execution of this program on one
of our test compilers gave the following output:
The at( ) member throws an object of class out_of_range,
which derives (ultimately) from std::exception. By catching this object
in an exception handler, you can take appropriate remedial actions such as
recalculating the offending subscript or growing the array. Using string::operator[ ]( )
gives no such protection and is as dangerous as char array processing in
C.
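As one possible form of such a remedial action, the sketch below catches out_of_range specifically and substitutes a fallback character. The helper name charAtOr is hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: return the character at index i, or
// the fallback character if i is out of range.
char charAtOr(const std::string& s, std::size_t i,
              char fallback) {
  try {
    return s.at(i);
  } catch(const std::out_of_range&) {
    return fallback; // Recover instead of crashing
  }
}

void checkCharAtOr() {
  std::string s("1234");
  assert(charAtOr(s, 1, '?') == '2');
  assert(charAtOr(s, 5, '?') == '?');
}
```

Catching the most specific exception type, here out_of_range from the <stdexcept> header, keeps unrelated failures from being swallowed by the same handler.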
The program Find.cpp earlier in this chapter leads us
to ask the obvious question: Why isn’t case-insensitive comparison part of the
standard string class? The answer provides interesting background on the
true nature of C++ string objects.
Consider what it means for a character to have “case.”
Written Hebrew, Farsi, and Kanji don’t use the concept of upper- and lowercase,
so for those languages this idea has no meaning. It would seem that if there
were a way to designate some languages as “all uppercase” or “all lowercase,”
we could design a generalized solution. However, some languages that employ the
concept of “case” also change the meaning of particular characters with
diacritical marks, for example: the cedilla in Spanish, the circumflex in
French, and the umlaut in German. For this reason, any case-sensitive collating
scheme that attempts to be comprehensive will be nightmarishly complex to use.
Although we usually treat the C++ string as a class,
this is really not the case. The string type is a specialization of a
more general constituent, the basic_string< >
template. Observe how string is declared in the Standard C++ header file:
typedef basic_string<char> string;
To understand the nature of the string class, look at the basic_string< >
template:
template<class charT, class traits = char_traits<charT>,
  class allocator = allocator<charT> > class basic_string;
In Chapter 5, we examine templates in great detail (much
more than in Chapter 16 of Volume 1). For now, just notice that the string
type is created when the basic_string template is instantiated with char.
Inside the basic_string< > template declaration, the
line:
class traits = char_traits<charT>,
tells us that the behavior of the class made from the basic_string< >
template is specified by a class based on the template char_traits< >.
Thus, the basic_string< > template produces
string-oriented classes that manipulate types other than char (wide
characters, for example). To do this, the char_traits< > template
controls the content and collating behaviors of a variety of character sets
using the character comparison functions eq( ) (equal) and lt( )
(less than); some library implementations also supply ne( ) (not equal),
although the Standard does not require it. The basic_string< >
string comparison functions rely on these.
This is why the string class doesn’t include
case-insensitive member functions: that’s not in its job description. To change
the way the string class treats character comparison, you must supply a
different char_traits< > template because that defines
the behavior of the individual character comparison member functions.
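You can call the default traits members directly to see what basic_string relies on. The following sketch uses only eq( ), lt( ), compare( ), and length( ) from char_traits<char>, and the ordering checks assume an ASCII-compatible character set:

```cpp
#include <cassert>
#include <string>

void checkCharTraits() {
  typedef std::char_traits<char> Traits;
  // Single-character comparisons:
  assert(Traits::eq('a', 'a'));
  assert(!Traits::eq('a', 'A')); // Case matters by default
  assert(Traits::lt('A', 'a'));  // 65 < 97 in ASCII
  // compare() examines the first n characters:
  assert(Traits::compare("This", "That", 2) == 0);
  assert(Traits::compare("This", "That", 4) > 0);
  // length() counts up to the null terminator:
  assert(Traits::length("This") == 4);
}
```

All of these are static member functions, which is why a replacement traits class only needs to supply static functions with the same names.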
You can use this information to make a new type of string
class that ignores case. First, we’ll define a new case-insensitive char_traits< >
template that inherits from the existing template. Next, we’ll override only
the members we need to change to make character-by-character comparison case
insensitive. (In addition to the three lexical character comparison members
mentioned earlier, we’ll also supply a new implementation for the char_traits
functions find( ) and compare( ).) Finally, we’ll typedef
a new class based on basic_string, but using the case-insensitive ichar_traits
template for its second argument:
//: C03:ichar_traits.h
// Creating your own character traits.
#ifndef ICHAR_TRAITS_H
#define ICHAR_TRAITS_H
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <ostream>
#include <string>
using std::allocator;
using std::basic_string;
using std::char_traits;
using std::ostream;
using std::size_t;
using std::string;
using std::toupper;
using std::tolower;
struct ichar_traits : char_traits<char> {
// We'll only change character-by-
// character comparison functions
static bool eq(char c1st, char c2nd) {
return toupper(c1st) == toupper(c2nd);
}
static bool ne(char c1st, char c2nd) {
return !eq(c1st, c2nd);
}
static bool lt(char c1st, char c2nd) {
return toupper(c1st) < toupper(c2nd);
}
static int
compare(const char* str1, const char* str2, size_t n)
{
for(size_t i = 0; i < n; ++i) {
if(str1 == 0)
return -1;
else if(str2 == 0)
return 1;
else if(tolower(*str1) < tolower(*str2))
return -1;
else if(tolower(*str1) > tolower(*str2))
return 1;
assert(tolower(*str1) == tolower(*str2));
++str1; ++str2; // Compare the other chars
}
return 0;
}
static const char*
find(const char* s1, size_t n, char c) {
while(n-- > 0)
if(toupper(*s1) == toupper(c))
return s1;
else
++s1;
return 0;
}
};
typedef basic_string<char, ichar_traits> istring;
inline ostream& operator<<(ostream& os,
const istring& s) {
return os << string(s.c_str(), s.length());
}
#endif // ICHAR_TRAITS_H ///:~
We provide a typedef named istring so that our
class will act like an ordinary string in every way, except that it will
make all comparisons without respect to case. For convenience, we’ve also
provided an overloaded operator<<( ) so that you can print istrings.
Here’s an example:
//: C03:ICompare.cpp
#include <cassert>
#include <iostream>
#include "ichar_traits.h"
using namespace std;
int main() {
// The same letters except for case:
istring first = "tHis";
istring second = "ThIS";
cout << first << endl;
cout << second << endl;
assert(first.compare(second) == 0);
assert(first.find('h') == 1);
assert(first.find('I') == 2);
assert(first.find('x') == string::npos);
} ///:~
This is just a toy example. To make istring fully
equivalent to string, we’d have to create the other functions necessary
to support the new istring type.
The <string> header provides a wide string
class via the following typedef:
typedef basic_string<wchar_t> wstring;
Wide string support also reveals itself in wide streams
(wostream in place of ostream, also defined in <iostream>)
and in the header <cwctype>, a wide-character version of <cctype>.
This along with the wchar_t specialization of char_traits in the
standard library allows us to do a wide-character version of ichar_traits:
//: C03:iwchar_traits.h {-g++}
// Creating your own wide-character traits.
#ifndef IWCHAR_TRAITS_H
#define IWCHAR_TRAITS_H
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cwctype>
#include <ostream>
#include <string>
using std::allocator;
using std::basic_string;
using std::char_traits;
using std::size_t;
using std::towlower;
using std::towupper;
using std::wostream;
using std::wstring;
struct iwchar_traits : char_traits<wchar_t> {
// We'll only change character-by-
// character comparison functions
static bool eq(wchar_t c1st, wchar_t c2nd) {
return towupper(c1st) == towupper(c2nd);
}
static bool ne(wchar_t c1st, wchar_t c2nd) {
return towupper(c1st) != towupper(c2nd);
}
static bool lt(wchar_t c1st, wchar_t c2nd) {
return towupper(c1st) < towupper(c2nd);
}
static int compare(
const wchar_t* str1, const wchar_t* str2, size_t n)
{
for(size_t i = 0; i < n; i++) {
if(str1 == 0)
return -1;
else if(str2 == 0)
return 1;
else if(towlower(*str1) < towlower(*str2))
return -1;
else if(towlower(*str1) > towlower(*str2))
return 1;
assert(towlower(*str1) == towlower(*str2));
++str1; ++str2; // Compare the other wchar_ts
}
return 0;
}
static const wchar_t*
find(const wchar_t* s1, size_t n, wchar_t c) {
while(n-- > 0)
if(towupper(*s1) == towupper(c))
return s1;
else
++s1;
return 0;
}
};
typedef basic_string<wchar_t, iwchar_traits>
iwstring;
inline wostream& operator<<(wostream& os,
const iwstring& s) {
return os << wstring(s.c_str(), s.length());
}
#endif // IWCHAR_TRAITS_H ///:~
As you can see, this is mostly an exercise in placing a ‘w’
in the appropriate place in the source code. The test program looks like this:
//: C03:IWCompare.cpp {-g++}
#include <cassert>
#include <iostream>
#include "iwchar_traits.h"
using namespace std;
int main() {
// The same letters except for case:
iwstring wfirst = L"tHis";
iwstring wsecond = L"ThIS";
wcout << wfirst << endl;
wcout << wsecond << endl;
assert(wfirst.compare(wsecond) == 0);
assert(wfirst.find('h') == 1);
assert(wfirst.find('I') == 2);
assert(wfirst.find('x') == wstring::npos);
} ///:~
Unfortunately, some compilers still do not provide robust
support for wide characters.
If you’ve looked at the sample code
in this book closely, you’ve noticed that certain tokens in the comments
surround the code. These are used by a Python program that Bruce wrote to
extract the code into files and set up makefiles for building the code. For
example, a double-slash followed by a colon at the beginning of a line denotes
the first line of a source file. The rest of the line contains information
describing the file’s name and location and whether it should be only compiled
rather than fully built into an executable file. For example, the first line in
the previous program contains the string C03:IWCompare.cpp,
indicating that the file IWCompare.cpp should be extracted into the
directory C03.
The last line of a source file contains a triple-slash
followed by a colon and a tilde. If the first line has an exclamation point
immediately after the colon, the first and last lines of the source code are
not to be output to the file (this is for data-only files). (If you’re
wondering why we’re avoiding showing you these tokens, it’s because we don’t
want to break the code extractor when applied to the text of the book!)
Bruce’s Python program does a lot more than just extract
code. If the token “{O}” follows the file name, its makefile entry will
only be set up to compile the file and not to link it into an executable. (The
Test Framework in Chapter 2 is built this way.) To link such a file with
another source example, the target executable’s source file will contain an “{L}”
directive, as in:
This section will present a program to just extract all the
code so that you can compile and inspect it manually. You can use this program
to extract all the code in this book by saving the document file as a text file (let’s call it
TICV2.txt) and by executing something like the following on a shell command
line:
C:> extractCode TICV2.txt /TheCode
This command reads the text file TICV2.txt and writes
all the source code files in subdirectories under the top-level directory /TheCode.
The directory tree will look like the following:
TheCode/
   C0B/
   C01/
   C02/
   C03/
   C04/
   C05/
   C06/
   C07/
   C08/
   C09/
   C10/
   C11/
   TestSuite/
The source files containing the examples from each chapter
will be in the corresponding directory.
Here’s the program:
//: C03:ExtractCode.cpp {-edg} {RunByHand}
// Extracts code from text.
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
// Legacy non-standard C header for mkdir()
#if defined(__GNUC__) || defined(__MWERKS__)
#include <sys/stat.h>
#elif defined(__BORLANDC__) || defined(_MSC_VER) \
|| defined(__DMC__)
#include <direct.h>
#else
#error Compiler not supported
#endif
// Check to see if directory exists
// by attempting to open a new file
// for output within it.
bool exists(string fname) {
size_t len = fname.length();
if(fname[len-1] != '/' && fname[len-1] != '\\')
fname.append("/");
fname.append("000.tmp");
ofstream outf(fname.c_str());
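// Explicit test; this also compiles where the stream's
// conversion to bool is explicit (C++11 and later):
bool existFlag = !outf.fail();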
if(outf) {
outf.close();
remove(fname.c_str());
}
return existFlag;
}
int main(int argc, char* argv[]) {
// See if input file name provided
if(argc == 1) {
cerr << "usage: extractCode file [dir]" << endl;
exit(EXIT_FAILURE);
}
// See if input file exists
ifstream inf(argv[1]);
if(!inf) {
cerr << "error opening file: "
<< argv[1] << endl;
exit(EXIT_FAILURE);
}
// Check for optional output directory
string root("./"); // current is default
if(argc == 3) {
// See if output directory exists
root = argv[2];
if(!exists(root)) {
cerr << "no such directory: "
<< root << endl;
exit(EXIT_FAILURE);
}
size_t rootLen = root.length();
if(root[rootLen-1] != '/' &&
root[rootLen-1] != '\\')
root.append("/");
}
// Read input file line by line
// checking for code delimiters
string line;
bool inCode = false;
bool printDelims = true;
ofstream outf;
while(getline(inf, line)) {
size_t findDelim = line.find("//"
"/:~");
if(findDelim != string::npos) {
// Output last line and close file
if(!inCode) {
cerr << "Lines out of order"
<< endl;
exit(EXIT_FAILURE);
}
assert(outf);
if(printDelims)
outf << line << endl;
outf.close();
inCode = false;
printDelims = true;
} else {
findDelim = line.find("//"
":");
if(findDelim == 0) {
// Check for '!' directive
if(line[3] == '!') {
printDelims = false;
++findDelim; // To skip '!' for next search
}
// Extract subdirectory name, if any
size_t startOfSubdir =
line.find_first_not_of(" \t",
findDelim+3);
findDelim = line.find(':', startOfSubdir);
if(findDelim == string::npos) {
cerr << "missing filename information\n" << endl;
exit(EXIT_FAILURE);
}
string subdir;
if(findDelim > startOfSubdir)
subdir = line.substr(startOfSubdir,
findDelim -
startOfSubdir);
// Extract file name (better be one!)
size_t startOfFile = findDelim + 1;
size_t endOfFile =
line.find_first_of(" \t",
startOfFile);
if(endOfFile == startOfFile) {
cerr << "missing filename"
<< endl;
exit(EXIT_FAILURE);
}
// We have all the pieces; build fullPath name
string fullPath(root);
if(subdir.length() > 0)
fullPath.append(subdir).append("/");
assert(fullPath[fullPath.length()-1] == '/');
if(!exists(fullPath))
#if defined(__GNUC__) || defined(__MWERKS__)
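// A mode of 0 would make the new directory unusable
// on POSIX systems; 0777 is still filtered by umask:
mkdir(fullPath.c_str(), 0777); // Create subdir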
#else
mkdir(fullPath.c_str()); // Create subdir
#endif
fullPath.append(line.substr(startOfFile,
endOfFile - startOfFile));
outf.open(fullPath.c_str());
if(!outf) {
cerr << "error opening "
<< fullPath
<< " for output"
<< endl;
exit(EXIT_FAILURE);
}
inCode = true;
cout << "Processing " <<
fullPath << endl;
if(printDelims)
outf << line << endl;
}
else if(inCode) {
assert(outf);
outf << line << endl; // Output middle code line
}
}
}
exit(EXIT_SUCCESS);
} ///:~
First, you’ll notice some conditional compilation directives.
The mkdir( ) function, which creates a directory in the file
system, is defined by the POSIX standard
in the header <sys/stat.h>. Unfortunately, many compilers still
use a different header (<direct.h>). The respective signatures for
mkdir( ) also differ: POSIX specifies two arguments, the older
versions just one. For this reason, there is more conditional compilation later
in the program to choose the right call to mkdir( ). We normally
don’t use conditional compilation in the examples in this book, but this
particular program is too useful not to put a little extra work into, since you
can use it to extract all the code.
The exists( ) function in ExtractCode.cpp
tests whether a directory exists by opening a temporary file in it. If the open
fails, the directory doesn’t exist. You remove a file by sending its name as a char*
to std::remove( ).
The main program validates the command-line arguments and
then reads the input file a line at a time, looking for the special source code
delimiters. The Boolean flag inCode indicates that the program is in the
middle of a source file, so lines should be output. The printDelims flag
will be true if the opening token is not followed by an exclamation point;
otherwise the first and last lines are not written. It is important to check
for the closing delimiter first, because the start token is a subset, and
searching for the start token first would return a successful find for both
cases. If we encounter the closing token, we verify that we are in the middle
of processing a source file; otherwise, something is wrong with the way the
delimiters are laid out in the text file. If inCode is true, all is
well, and we (optionally) write the last line and close the file. When the
opening token is found, we parse the directory and file name components and
open the file. The following string-related functions were used in this
example: length( ), append( ), getline( ), find( )
(two versions), find_first_not_of( ), substr( ), find_first_of( ),
c_str( ), and, of course, operator<<( ).
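The parsing of the opening line is the part of main( ) that leans most heavily on these functions, so it may help to see it pulled out on its own. The following is a simplified sketch, not the code the program actually uses: the function name parseOpening is hypothetical, and handling of the exclamation-point directive is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that mirrors the parsing steps in
// main(): given an opening line, it stores the directory
// and file name and returns true, or returns false if the
// line is not a well-formed opening line.
bool parseOpening(const std::string& line,
                  std::string& subdir, std::string& file) {
  // The token is split in two, as in ExtractCode.cpp
  const std::string token = std::string("//") + ":";
  if(line.find(token) != 0)
    return false;
  // Skip whitespace after the 3-character token:
  std::size_t start = line.find_first_not_of(" \t", 3);
  if(start == std::string::npos)
    return false;
  // The directory name runs up to the next colon:
  std::size_t colon = line.find(':', start);
  if(colon == std::string::npos)
    return false;
  subdir = line.substr(start, colon - start);
  // The file name runs up to the next space or tab:
  std::size_t end = line.find_first_of(" \t", colon + 1);
  if(end == colon + 1)
    return false; // No file name
  // If end is npos, substr() takes the rest of the line
  file = line.substr(colon + 1, end - (colon + 1));
  return !file.empty();
}

void checkParseOpening() {
  std::string dir, file;
  std::string line =
    std::string("//") + ": C03:IWCompare.cpp {-g++}";
  assert(parseOpening(line, dir, file));
  assert(dir == "C03");
  assert(file == "IWCompare.cpp");
  assert(!parseOpening("int main() {", dir, file));
}
```

Notice that substr( ) quietly clamps a count that runs past the end of the string, which is what lets the npos case fall through without a special test.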
C++ string objects provide developers with a number
of great advantages over their C counterparts. For the most part, the string
class makes referring to strings with character pointers unnecessary. This
eliminates an entire class of software defects that arise from the use of
uninitialized and incorrectly valued pointers.
C++ strings dynamically and transparently grow their
internal data storage space to accommodate increases in the size of the string
data. When the data in a string grows beyond the limits of the memory initially
allocated to it, the string object will make the memory management calls that
take space from and return space to the heap. Consistent allocation schemes
prevent memory leaks and have the potential to be much more efficient than
“roll your own” memory management.
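You can watch this growth happen by comparing size( ) with capacity( ) as a string gets longer. The sketch below counts how often the capacity changes; the function name countReallocations is ours, and the exact count you see depends entirely on your library's allocation strategy:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append n copies of c one at a time and count how many
// times the string's capacity changes along the way.
std::size_t countReallocations(std::size_t n, char c) {
  std::string s;
  std::size_t changes = 0;
  std::size_t cap = s.capacity();
  for(std::size_t i = 0; i < n; ++i) {
    s += c;
    // The string never holds more than it has room for:
    assert(s.size() <= s.capacity());
    if(s.capacity() != cap) {
      ++changes;
      cap = s.capacity();
    }
  }
  assert(s.size() == n);
  return changes;
}
```

On typical implementations, appending several thousand characters changes the capacity only a handful of times, because each reallocation reserves more room than was asked for.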
The string class member functions provide a fairly
comprehensive set of tools for creating, modifying, and searching in strings.
String comparisons are always case sensitive, but you can work around this by
copying string data to C-style null-terminated strings and using
case-insensitive string comparison functions, temporarily converting the data
held in string objects to a single case, or by creating a case-insensitive
string class that overrides the character traits used to create the basic_string
object.
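Of these workarounds, converting both strings to a single case is the quickest to try. Here is a minimal sketch, assuming plain char strings in the default "C" locale; the helper names upperCopy and equalIgnoringCase are ours:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: return an uppercase copy of s.
std::string upperCopy(const std::string& s) {
  std::string result(s);
  for(std::string::size_type i = 0; i < result.size(); ++i)
    // toupper() expects a value representable as
    // unsigned char, hence the casts:
    result[i] = static_cast<char>(std::toupper(
      static_cast<unsigned char>(result[i])));
  return result;
}

// Hypothetical helper: compare while ignoring case.
bool equalIgnoringCase(const std::string& a,
                       const std::string& b) {
  return upperCopy(a) == upperCopy(b);
}

void checkIgnoringCase() {
  assert(equalIgnoringCase("tHis", "ThIS"));
  assert(!equalIgnoringCase("This", "That"));
  // The originals are copied, never modified:
  std::string s("abc");
  assert(upperCopy(s) == "ABC" && s == "abc");
}
```

This approach copies both strings on every comparison, so the traits-based istring shown earlier is the better fit when comparisons are frequent.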
Solutions
to selected exercises can be found in the electronic document The Thinking
in C++ Volume 2 Annotated Solution Guide, available for a small fee from www.MindView.net.
1. Write and test a function that reverses the order of the
characters in a string.
2. A palindrome is a word or group of words that read the same
forward and backward. For example “madam” or “wow.” Write a program that takes
a string argument from the command line and, using the function from the
previous exercise, prints whether the string was a palindrome or not.
3. Make your program from Exercise 2 return true even if
symmetric letters differ in case. For example, “Civic” would still return true
although the first letter is capitalized.
4. Change your program from Exercise 3 to ignore punctuation and
spaces as well. For example “Able was I, ere I saw Elba.” would report true.
5. Using the following string declarations and only chars (no
string literals or magic numbers):
string one("I walked down the canyon with the moving mountain bikers.");
string two("The bikers passed by me too close for comfort.");
string three("I went hiking instead.");
produce the following sentence:
I moved down the canyon with the mountain bikers. The mountain bikers passed by
me too close for comfort. So I went hiking instead.
6. Write a program named replace that takes three
command-line arguments representing an input text file, a string to replace
(call it from), and a replacement string (call it to). The
program should write a new file to standard output with all occurrences of from
replaced by to.
7. Repeat the previous exercise but replace all instances of from
regardless of case.
8. Make your program from Exercise 3 take a filename from the command-line,
and then display all words that are palindromes (ignoring case) in the file. Do
not display duplicates (even if their case differs). Do not try to look for
palindromes that are larger than a word (unlike in Exercise 4).
9. Modify HTMLStripper.cpp so that when it encounters a tag,
it displays the tag’s name, then displays the file’s contents between the tag
and the file’s ending tag. Assume no nesting of tags, and that all tags have
ending tags (denoted with </TAGNAME>).
10. Write a program that takes three command-line arguments (a
filename and two strings) and displays to the console all lines in the file
that have both strings in the line, either string, only one string, or neither
string, based on user input at the beginning of the program (the user will
choose which matching mode to use). For all but the “neither string” option,
highlight the input string(s) by placing an asterisk (*) at the beginning and
end of each string’s occurrence when it is displayed.
11. Write a program that takes two command-line arguments (a filename
and a string) and counts the number of times the string occurs in the file,
even as a substring (but ignoring overlaps). For example, an input string of “ba”
would match twice in the word “basketball,” but an input string of “ana” would
match only once in the word “banana.” Display to the console the number of
times the string is matched in the file, as well as the average length of the
words where the string occurred. (If the string occurs more than once in a
word, only count the word once in figuring the average.)
12. Write a program that takes a filename from the command line and
profiles the character usage, including punctuation and spaces (all character
values of 0x21 [33] through 0x7E [126], as well as the space character). That is,
count the number of occurrences of each character in the file, then display the
results sorted either sequentially (space, then !, ", #, etc.) or by
ascending or descending frequency based on user input at the beginning of the
program. For space, display the word “Space” instead of the character ' '. A
sample run might look something like this:
Format sequentially, ascending, or descending (S/A/D): D
t: 526
r: 490
etc.
13. Using find( ) and rfind( ), write a
program that takes two command-line arguments (a filename and a string) and
displays the first and last words (and their indexes) not matching the string,
as well as the indexes of the first and last instances of the string. Display “Not
Found” if any of the searches fail.
14. Using the find_first_of “family” of functions (but not
exclusively), write a program that will remove all non-alphanumeric characters
except spaces and periods from a file, then capitalize the first letter
following a period.
15. Again using the find_first_of “family” of functions, write
a program that accepts a filename as a command-line argument and then formats
all numbers in the file to currency. Ignore decimal points after the first
until a non-numeric character is found, and round to the nearest hundredth. For
example, the string 12.399abc29.00.6a would be formatted (in the USA) to
$12.40abc$29.01a.
16. Write a program that accepts two command-line arguments (a
filename and a number) and scrambles each word in the file by randomly
switching two of its letters the number of times specified in the second
argument. (That is, if 0 is passed into your program from the command line, the
words should not be scrambled; if 1 is passed in, one pair of randomly chosen
letters should be swapped; for an input of 2, two random pairs should be
swapped; and so on.)
17. Write a program that accepts a filename from the command line and
displays the number of sentences (defined as the number of periods in the
file), average number of characters per sentence, and the total number of
characters in the file.
18. Prove to yourself that the at( ) member function
really will throw an exception if an attempt is made to go out of bounds, and
that the indexing operator ([ ]) won’t.
You can do much more with the general
I/O problem than just take standard I/O and turn it into a class.
Wouldn’t it be nice if you could make all the usual
“receptacles”—standard I/O, files, and even blocks of memory—look the same so
that you need to remember only one interface? That’s the idea behind iostreams.
They’re much easier, safer, and sometimes even more efficient than the assorted
functions from the Standard C stdio library.
The iostreams classes are usually the first part of the C++
library that new C++ programmers learn to use. This chapter discusses how
iostreams are an improvement over C’s stdio facilities and explores the
behavior of file and string streams in addition to the standard console
streams.
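As a rough illustration of the "one interface" idea, the following sketch writes the same text to the console, to a block of memory, and to a file through a single function. The names report( ), reportToString( ), and reportToFile( ) are invented for this sketch and are not part of the library; only std::ostream, std::ostringstream, and std::ofstream are standard.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes one labeled value to ANY output stream.
void report(std::ostream& os, const std::string& label, int value) {
  os << label << " = " << value << '\n';
}

// The same function serves an in-memory "receptacle"...
std::string reportToString(const std::string& label, int value) {
  std::ostringstream os;
  report(os, label, value);
  return os.str();
}

// ...and a file (the path is supplied by the caller).
bool reportToFile(const char* path, const std::string& label,
                  int value) {
  std::ofstream out(path);
  if(!out) return false; // Could not open the file
  report(out, label, value);
  return out.good();
}
```

Calling report(std::cout, "count", 3) sends the same line to the console. The function body never needs to know which kind of stream it was handed.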
You might wonder what’s wrong with the good old C library.
Why not “wrap” the C library in a class and be done with it? Sometimes this is a
fine solution. For example, suppose you want to make sure that the file
represented by a stdio FILE pointer is always safely opened and
properly closed without having to rely on the user to remember to call the fclose( )
function. The following class is such an attempt:
//: C04:FileClass.h
// stdio files wrapped.
#ifndef FILECLASS_H
#define FILECLASS_H
#include <cstdio>
#include <stdexcept>
class FileClass {
  std::FILE* f;
public:
  struct FileClassError : std::runtime_error {
    FileClassError(const char* msg)
    : std::runtime_error(msg) {}
  };
  FileClass(const char* fname, const char* mode = "r");
  ~FileClass();
  std::FILE* fp();
};
#endif // FILECLASS_H ///:~
When you perform file I/O in C, you work with a naked
pointer to a FILE struct, but this class wraps around the pointer and
guarantees it is properly initialized and cleaned up using the constructor and
destructor. The second constructor argument is the file mode, which defaults to
“r” for “read.”
To fetch the value of the pointer to use in the file I/O
functions, you use the fp( ) access function. Here are the member
function definitions:
//: C04:FileClass.cpp {O}
// FileClass Implementation.
#include "FileClass.h"
#include <cstdlib>
#include <cstdio>
using namespace std;
FileClass::FileClass(const char* fname, const char* mode) {
  if((f = fopen(fname, mode)) == 0)
    throw FileClassError("Error opening file");
}
FileClass::~FileClass() { fclose(f); }
FILE* FileClass::fp() { return f; } ///:~
The constructor calls fopen( ), as you would
normally do, but it also ensures that the result isn’t zero, which indicates a
failure upon opening the file. If the file does not open as expected, an exception
is thrown.
The destructor closes the file, and the access function fp( )
returns f. Here’s a simple example using FileClass:
//: C04:FileClassTest.cpp
//{L} FileClass
#include <cstdlib>
#include <iostream>
#include "FileClass.h"
using namespace std;
int main() {
  try {
    FileClass f("FileClassTest.cpp");
    const int BSIZE = 100;
    char buf[BSIZE];
    while(fgets(buf, BSIZE, f.fp()))
      fputs(buf, stdout);
  } catch(FileClass::FileClassError& e) {
    cout << e.what() << endl;
    return EXIT_FAILURE;
  }
  return EXIT_SUCCESS;
} // File automatically closed by destructor
///:~
You create the FileClass object and use it in normal
C file I/O function calls by calling fp( ). When you’re done with
it, just forget about it; the file is closed by the destructor at the end of
its scope.
Even though the FILE pointer is private, it isn’t
particularly safe because fp( ) retrieves it. Since the only effect
seems to be guaranteed initialization and cleanup, why not make it public or
use a struct instead? Notice that while you can get a copy of f
using fp( ), you cannot assign to f; that’s completely under
the control of the class. However, after capturing the pointer returned by fp( ),
the client programmer can still assign to the structure elements or even close
the file, so the safety lies in guaranteeing a valid FILE pointer rather than
proper contents of the structure.
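The limits of this protection can be sketched as follows. TempFile is a hypothetical stand-in with the same shape as FileClass; it is built on std::tmpfile( ) only so that the sketch needs no filename, and tamper( ) is an invented name for the client code.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in with the same shape as FileClass; it uses
// std::tmpfile() so that the sketch needs no filename.
class TempFile {
  std::FILE* f;
  TempFile(const TempFile&);            // Copying disabled (not defined)
  TempFile& operator=(const TempFile&);
public:
  TempFile() : f(std::tmpfile()) {
    if(f == 0)
      throw std::runtime_error("Error creating temporary file");
  }
  ~TempFile() { std::fclose(f); }
  std::FILE* fp() { return f; }
};

// Client code: returns the file position after tampering with the
// stream through the pointer obtained from fp().
long tamper(TempFile& t) {
  std::FILE* raw = t.fp();   // A copy of the pointer
  // t.fp() = 0;             // Would not compile: fp() returns by value
  std::fputs("hello", raw);  // Position is now 5
  std::rewind(raw);          // The class cannot prevent this
  // std::fclose(raw);       // Would compile, but the destructor would
  //                         // then close the file a second time
  return std::ftell(raw);    // 0 after the rewind
}
```

The member f itself cannot be reseated from outside, but everything reachable through the copied pointer is open to the client.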
If you want complete safety, you must prevent the user from
directly accessing the FILE pointer. Some version of all the normal file
I/O functions must show up as class members so that everything you can do with
the C approach is available in the C++ class:
//: C04:Fullwrap.h
// Completely hidden file IO.
#ifndef FULLWRAP_H
#define FULLWRAP_H
#include <cstddef>
#include <cstdio>
#undef getc
#undef putc
#undef ungetc
using std::size_t;
using std::fpos_t;
class File {
  std::FILE* f;
  std::FILE* F(); // Produces checked pointer to f
public:
  File(); // Create object but don't open file
  File(const char* path, const char* mode = "r");
  ~File();
  int open(const char* path, const char* mode = "r");
  int reopen(const char* path, const char* mode);
  int getc();
  int ungetc(int c);
  int putc(int c);
  int puts(const char* s);
  char* gets(char* s, int n);
  int printf(const char* format, ...);
  size_t read(void* ptr, size_t size, size_t n);
  size_t write(const void* ptr, size_t size, size_t n);
  int eof();
  int close();
  int flush();
  int seek(long offset, int whence);
  int getpos(fpos_t* pos);
  int setpos(const fpos_t* pos);
  long tell();
  void rewind();
  void setbuf(char* buf);
  int setvbuf(char* buf, int type, size_t sz);
  int error();
  void clearErr();
};
#endif // FULLWRAP_H ///:~
This class contains almost all the file I/O functions from <cstdio>.
(vfprintf( ) is missing as a member; it is used to implement the printf( ) member function.)
File has the same constructor as in the previous
example, and it also has a default constructor. The default constructor is
important if you want to create an array of File objects or use a File
object as a member of another class where the initialization doesn’t happen in
the constructor, but some time after the enclosing object is created.
The default constructor sets the private FILE pointer
f to zero. But now, before any reference to f, its value must be
checked to ensure it isn’t zero. This is accomplished with F( ),
which is private because it is intended to be used only by other member
functions. (We don’t want to give the user direct access to the underlying FILE
structure in this class.)
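One way the checking might look is sketched below with a cut-down, hypothetical class named MiniFile. It is not the implementation of File: the choice to throw std::logic_error from F( ), the member names putLine( ) and getLine( ), and the helper unopenedIsCaught( ) are all assumptions made for this sketch.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical, cut-down cousin of File showing only the checking idea.
class MiniFile {
  std::FILE* f;
  std::FILE* F() { // Produces checked pointer to f
    if(f == 0)
      throw std::logic_error("MiniFile: no file is open");
    return f;
  }
  MiniFile(const MiniFile&);            // Copying disabled (not defined)
  MiniFile& operator=(const MiniFile&);
public:
  MiniFile() : f(0) {} // Create object but don't open file
  ~MiniFile() { close(); }
  int open(const char* path, const char* mode = "r") {
    close(); // Release any file already held
    f = std::fopen(path, mode);
    return f ? 0 : -1;
  }
  // Every operation reaches the pointer through F(), never through f.
  int putLine(const char* s) { return std::fputs(s, F()); }
  char* getLine(char* s, int n) { return std::fgets(s, n, F()); }
  int close() {
    if(f == 0) return 0; // Nothing to close
    int result = std::fclose(f);
    f = 0; // Later calls will be caught by F()
    return result;
  }
};

// Returns true if using an unopened MiniFile is reported by F().
bool unopenedIsCaught() {
  MiniFile m;
  try {
    m.putLine("never written");
  } catch(const std::logic_error&) {
    return true;
  }
  return false;
}
```

Because each member function goes through F( ), a default-constructed or already closed object reports the mistake instead of handing a null pointer to the C library.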
This approach is not a terrible solution by any means. It’s
quite functional, and you could imagine making similar classes for standard
(console) I/O and for in-core formatting (reading/writing a piece of memory
rather than a file or the console).
The stumbling block is the runtime interpreter used for the
variable argument list functions. This is the code that parses your format
string at runtime and grabs and interprets arguments from the variable argument
list. It’s a problem for four reasons.
1. Even if you use only a fraction of the functionality of the
interpreter, the whole thing gets loaded into your executable. So if you say printf("%c",
'x');, you’ll get the whole package, including the parts that print
floating-point numbers and strings. There’s no standard option for reducing the
amount of space used by the program.
2. Because the interpretation happens at runtime, you can’t eliminate
the performance overhead. It’s frustrating because all the information is there
in the format string at compile time, but it’s not evaluated until runtime.
If you could parse the arguments in the format string at compile time,
you could make direct function calls that have the potential to be much faster
than a runtime interpreter (although the printf( ) family of
functions is usually quite well optimized).
3. Because the format string is not evaluated until runtime, there
can be no compile-time error checking. You’re probably familiar with this problem if you’ve tried to find bugs that came from using the wrong number or type of
arguments in a printf( ) statement. C++ makes a big deal out of
compile-time error checking to find errors early and make your life easier. It
seems a shame to throw type safety away for an I/O library, especially since
I/O is used a lot.
4. For C++, the most crucial problem is that the printf( )
family of functions is not particularly extensible. They’re really designed to
handle only the basic data types in C (char, int, float, double,
wchar_t, char*, wchar_t*, and