Xiao Peng

My personal blog has moved to xiaopeng.me; posts about design patterns will also be synced here.



Posted on 2007-07-19 15:23 by 勇敢的鸵鸟

Dale Emery:

I'm puzzling about the wisdom of using generics, especially early in the
development of a product.
Here's an example. Suppose I'm building a book catalog (as part of a larger
system). Almost immediately I want some kind of book list. In C#, I could
jump right to List<Book>. But if I do that, I automatically expose a
zillion methods that I may or may not want, and that may or may not work as
intended if called. It seems to me that if I expose List<Book> throughout
my product, I lose some control of the list. I don't know whether that's a
problem, but my gut tells me it is.
So instead I write my own BookList class. The BookList class, of course,
exposes only the few methods that my book catalog needs right now: Add(),
Delete(), Contains(), a few others. But after about the third test I find
that the easiest thing to do is add a List<Book> field and delegate
everything to it. So my BookList class is essentially a simplifying wrapper
around List<Book>.
Later I may add some features that List<Book> doesn't handle. But for now,
delegating is the easiest thing to do.
So that's how I do it. Something about this bugs me, though I can't put my
finger on just what.
What are your thoughts on this? Do you prefer to just use List<Book>, and
ignore the interface bloat? Do you write BookList from scratch and not
delegate to List<Book>? Do you do as I do and delegate? Or something else?
And above all: What benefits do you see to your preferred style? What
drawbacks to the other styles?
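
The delegating wrapper Dale describes might look roughly like the sketch below. It is only an illustration: the Book type and its Title property are assumptions (the thread never defines Book), and the Count property is added here for convenience.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Book type; the thread does not define one.
public sealed class Book
{
    public string Title { get; private set; }
    public Book(string title) { Title = title; }
}

// A simplifying wrapper: only the operations the catalog needs are public,
// and each one delegates to a private List<Book>.
public sealed class BookList
{
    private readonly List<Book> books = new List<Book>();

    public int Count { get { return books.Count; } }

    public void Add(Book book) { books.Add(book); }

    // Returns true if the book was present and has been removed.
    public bool Delete(Book book) { return books.Remove(book); }

    public bool Contains(Book book) { return books.Contains(book); }
}
```

Clients see three methods and a count instead of the full List<Book> surface, and the backing list can later be swapped without touching them.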


Ron Jeffries:

I used to struggle with this in Smalltalk. My default practice then
was to build my own collection class, so that I could give it
custom methods. But often, like you, I found myself wanting more and
more collection behavior, so that I'd have been mostly better off
just using a standard collection.
In Java / C#, I usually go directly to things like List<Book>, which
gives me a little type assistance, and the flexibility to go ahead
fairly smoothly.
Regarding "type assistance", I am not referring to gaining an
advantage from the strict typing. I hate that. But given that Java
/ C# have strict typing, using List<Book> at least lets me avoid
most of the gratuitous (Book) casts that I'd otherwise have to write.
However, what I'm liking most, when it happens, is when I have an
abstraction that isn't mostly collection, but is mostly something
else. In the reporting part of the shotgun app, we have Report,
which is of course a collection of Pages, and Page, which is a
collection of Paragraph (so far). These entities, though they do not
have much behavior, seem clearly to represent something that's
actually going on.
What this makes me want to do is ask what this List<Book> really is.
Is it a Library? Is it some kind of Recommendation? What's really
going on, I'll ask myself. Then I'll code that.
One more thing. As a rule, I think that collections faintly smell of
violation of the Law of Demeter. We get things out of collections,
and then do things to them, and then (often) put them back. So I
suspect, when I find myself looking for a List, that I'm quite
possibly looking in the wrong direction. Another clue that there may
be an abstraction missing.
Just me, just my thoughts ...
Ron Jeffries
It's easy to have a complicated idea.
It's very very hard to have a simple idea.
-- Carver Mead



This is a catalog of books, annotated with quotations and comments. I want
the basic bibliographic features of EndNote, combined with the terrific
note-taking features of EverNote. Some features I'd like:
- Quote specific passages (writing the quotation into the software), citing
page numbers.
- Annotate each book with any number of comments
- Associate each comment with the chunk of text (book, chapter, titled
section, page(s), quotations).
- Assign any number of topics to a book.
- Assign any number of topics to a comment.
- Distinguish my comments from quotations (my thoughts from the authors').
- Produce bibliographies in a variety of bibliographic styles.
- Produce a bibliography on a given topic, annotated with the book-level
comments I've entered for that topic.
- Search and sort books by title, author/editor, or other fields.
- Search books, quotations, and comments by topic.
- Search quotations and comments by content.
- Produce a list of comments I've made about a topic, with citations to the
chunk of text that inspired each.
- Maybe similar for journal and magazine articles, web sites, and other
text-based media.
Does that give you any ideas about what this List<Book> really is?



My own feeling is that generic collections don't communicate intent very
well. I see the same thing in over-reliance on primitives. When all
the data is of type String, the code starts getting very messy and
procedural. I think the same thing happens when the domain object is
just a collection, and the code to do something with that collection
resides wherever the results of that "doing something" is needed, rather
than encapsulated with the data.
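
As a rough illustration of keeping the "doing something" with the data, here is a minimal sketch. The Book fields and the TitlesBy method are hypothetical names chosen for this example, not anything proposed in the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Book type with just enough data for the example.
public sealed class Book
{
    public string Title { get; private set; }
    public string Author { get; private set; }

    public Book(string title, string author)
    {
        Title = title;
        Author = author;
    }
}

public sealed class BookCatalog
{
    private readonly List<Book> books = new List<Book>();

    public void Add(Book book) { books.Add(book); }

    // The query lives next to the data it needs, so callers ask the
    // catalog a domain question instead of looping over a raw list.
    public IReadOnlyList<string> TitlesBy(string author)
    {
        var titles = new List<string>();
        foreach (var book in books)
        {
            if (book.Author == author) titles.Add(book.Title);
        }
        return titles;
    }
}
```

If clients held a bare List<Book> instead, each place that needed this answer would have to repeat the loop and the comparison.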



Think in terms of how many classes will be dependent on BookList. If
it's a highly visible class, used throughout your app, you want an
interface so you can avoid refactoring when you eventually add the
findByAuthor() method.
On the other hand there's the YAGNI principle. If only a few classes
interact with the BookList, then it would be better to wait until you
know you need the added complexity and refactor at that point.



Greetings Dale,
I would tend to keep the wrapper class and name it BookCatalog. It's a
non-TDD reason, but either List<Book> or BookList would expose
implementation detail unnecessarily.
Some TDD/simple design reasons. I'm not sure any of these are very strong.
- using a List implies something about the catalog that you possibly
don't want: the assumption that it's ordered. So I think that'd be
violating expressiveness, and potentially YAGNI (of course, using
Collection or whatever C# analog would fix this problem)
- requiring clients to specify both type (Book) and collection
implementation (List) information in more than one place possibly
violates duplication. (Maybe the need to continually express type
information itself is redundant...)
- I think it's just easier, from a TDD perspective, to use the tests to
start describing things that clients can do with the collection. I'm
willing to bet that there will be some things you end up not wanting
clients to do with the collection; you could of course just wait and see.
I would rather build up abstractions like BookCatalog from scratch,
rather than try to control overblown implementations like List<Book>.



Hi Jeff,
> I would tend to keep the wrapper class and name it BookCatalog. It's a
> non-TDD reason, but either List<Book> or BookList would expose
> implementation detail unnecessarily.
I think that is indeed a TDD reason. Here's my thinking: The name
BookCatalog refers to a domain concept. The methods I put in the class
represent domain concepts. List<Book> includes methods that are not in my
domain, which injects noise into the communication. Also BookCatalog
describes more fully the domain concept I'm representing. Catalog suggests
searching and sorting better than does List.
So I'm applying this principle (related to the ones I described a few
messages ago): To refer to domain concepts in code, use names and coding
constructs that inject the least technical noise.



That's what I thought/hoped you meant. It seemed that some of the
thread was focusing (too much IMO) on the pros and cons of generics
when the issue is exactly the same if we expose, for example, an
ArrayList.
My practice on this is to expose a collection to any object that
is going to use it purely as a collection. Most commonly, that's
the domain object that has the collection in it.
Sometimes, it's useful for me to expose the collection as a property
of the domain object, in order to avoid a lot of pass-through methods
and to create a syntax that makes sense to me. So, I might do
something like...
BookCatalog.Books.Add( myBook );
I find this quite expressive. It says that a BookCatalog /has/
a list of Books and that it (probably) has other methods and
properties of its own.
While this does expose the list more than using a passthru method,
I can't see it as a universally bad thing. Of course, the exposed
property Books, should normally not be of the implementation class,
but should be an interface type like ICollection<Book>.
When to introduce this is another question. If I start with nothing
but a collection, I'll probably use it directly at first and create
the domain class as soon as I need a property or method on it.



Hi Charlie,
> That's what I thought/hoped you meant. It seemed that some of the
> thread was focusing (too much IMO) on the pros and cons of generics
> when the issue is exactly the same if we expose, for example, an
> ArrayList.
I think we can attribute most of that to my featuring that red herring in my
initial problem statement.
In retrospect, I can see that my puzzle wasn't just about generics, but
about prepackaged classes in general. And not just about using them, but
about exposing them. And not just about exposing them, but about exposing
them to clients of essential domain entities. And not just about exposing
them, but about defining essential domain entities in terms of
implementation technology.
If I were to restate the problem now, it seems kinda silly: I'm puzzling
about the wisdom of defining essential domain entities in terms of
implementation technology, and of exposing that underlying technology to
clients of essential domain entities.
If I state the problem that way, the solution is laughably obvious.



Somewhere along the way the class BookList got renamed to BookCatalog,
but I believe they are the same class under different names. So,
assuming this ...
What happens if you want to search by Publisher and Author, by only
Publisher, or only by Author? Do you code this?
public class BookCatalog {
    List<Book> findBooksByAuthor( ... ) {..}
    List<Book> findBooksByPublisher( ... ) {..}
    List<Book> findBooksByAuthorAndPublisher( ... ) {..}
}
Or are you suggesting you only implement the first two methods, then
make it the responsibility of the object that calls the method to
search by the other criteria?
And what happens if you want to add searches by publish date between
two dates, publishing country, etc.? The number of parameter
combinations becomes endless, unless you are proposing that the object
making the call is responsible for subdividing the BookList to include
additional criteria.
A better way might be to code this:
public class BookCatalog {
    BookCatalog findByAuthor( ... ) {..}
    BookCatalog findByPublisher( ... ) {..}
    BookCatalog findByPublishDates( DateTime,DateTime ) {..}
    BookCatalog findByPublishingCountry( ... ) {..}
}
Now everything about the BookCatalog stays within the catalog and you
continue to enjoy new features added to it over time.
I'd also have BookCatalog implement an IBookCatalog interface to
provide other benefits. In addition, I'd make it possible to
enumerate it by implementing IEnumerator<BookCatalog> and
IEnumerable<BookCatalog>. This way I am free to choose whatever
method is appropriate to store and enumerate the list of books,
whether it is a List<Book>, SortedList<key, Book>, or Dictionary<key, Book>.
If I added a lot of criteria, I'd change the find logic to accept
some kind of criteria object, such as:
public class BookCatalog {
    public BookCatalog find( List<IBookFindCriteria> ... );
}
And so on...
- George
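
George's idea of finds that return a narrower BookCatalog, backed by criteria objects, could be sketched as follows. Several things here are assumptions rather than his design: the thread names IBookFindCriteria but not its members, so the Matches method, the AuthorIs and PublisherIs classes, and the Book fields are all hypothetical. The sketch also takes a params array instead of a List<IBookFindCriteria>, and enumerates Book elements by implementing IEnumerable<Book>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical Book type with just enough data for the example.
public sealed class Book
{
    public string Title { get; private set; }
    public string Author { get; private set; }
    public string Publisher { get; private set; }

    public Book(string title, string author, string publisher)
    {
        Title = title;
        Author = author;
        Publisher = publisher;
    }
}

// Assumed shape of the criteria interface: one predicate per criterion.
public interface IBookFindCriteria
{
    bool Matches(Book book);
}

public sealed class AuthorIs : IBookFindCriteria
{
    private readonly string author;
    public AuthorIs(string author) { this.author = author; }
    public bool Matches(Book book) { return book.Author == author; }
}

public sealed class PublisherIs : IBookFindCriteria
{
    private readonly string publisher;
    public PublisherIs(string publisher) { this.publisher = publisher; }
    public bool Matches(Book book) { return book.Publisher == publisher; }
}

public sealed class BookCatalog : IEnumerable<Book>
{
    private readonly List<Book> books;

    public BookCatalog() { books = new List<Book>(); }
    private BookCatalog(List<Book> books) { this.books = books; }

    public void Add(Book book) { books.Add(book); }

    // Returns a new catalog holding only the books that satisfy every
    // criterion, so the result can be searched again.
    public BookCatalog Find(params IBookFindCriteria[] criteria)
    {
        var matches = new List<Book>();
        foreach (var book in books)
        {
            bool matchesAll = true;
            foreach (var criterion in criteria)
            {
                if (!criterion.Matches(book)) { matchesAll = false; break; }
            }
            if (matchesAll) matches.Add(book);
        }
        return new BookCatalog(matches);
    }

    public BookCatalog FindByAuthor(string author) { return Find(new AuthorIs(author)); }
    public BookCatalog FindByPublisher(string publisher) { return Find(new PublisherIs(publisher)); }

    // Callers can foreach over the catalog without knowing how books are stored.
    public IEnumerator<Book> GetEnumerator() { return books.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Because every find returns a BookCatalog, a combined search such as catalog.FindByAuthor("Smith").FindByPublisher("Acme") needs no dedicated findByAuthorAndPublisher method, and a new criterion is one more small class rather than another combination of parameters.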