i'm told -- in a private backchannel -- that marcello said this:
> Please google-books for "editions:ISBN1593082010"
> and tell us which one of the 289 editions this search
> turns up is in your mind the platinum-iridium one,
> the formatting of which we have to follow in our ebook edition.
> P.S. this is *the* most downloaded fiction book, so we have to do it right.

i hate to criticize the person who backchanneled this to me, because if marcello ever _does_ make a valid point, i would _surely_ like to be informed of it. but use common sense... because this "point" is a ridiculously stupid one; it shouldn't have wasted anyone's time, ever, not mine or anyone else's.

however, because i'm sure _some_ of you lurkers out there don't think these things through, i will answer this "point"...

***

the book in question, in case you are wondering, is jane austen's "pride and prejudice", an exceedingly popular public-domain book.

nonetheless, in spite of the many commercial printings of it, there are only _4_ public-domain full-view versions at google. (the fifth one only includes volume 2 of this 2-volume work.) these four editions are from 1844, 1853, 1870, and 1892.

the answer to the question as to which set of linebreaks and pagebreaks to use is this: the ones in the edition you digitize.

plain old common sense. if you didn't already know the answer, perhaps you might want to exercise your brain a little bit more...

if you're digitizing the 1844, use its linebreaks and pagebreaks.
if you're digitizing the 1853, use its linebreaks and pagebreaks.
if you're digitizing the 1870, use its linebreaks and pagebreaks.
if you're digitizing the 1892, use its linebreaks and pagebreaks.

if you'd prefer to create a "generic" version -- a "p.g. edition" (which, by the way, as i've often said before, i think is fine) -- then use whatever set of linebreaks and pagebreaks you want. you can even feel free to create a new set of them, if you like.
as long as you're not representing the text as being "faithful" to the extant volumes mentioned, i have no problem with it...

(but that's me. and you might want to keep in mind that there are _some_people_ who have tried very hard to cast aspersions on the p.g. e-texts precisely because some are not "faithful" to a specific edition. you might say it's a blatant smear attempt, and i would agree with you, but it has been used _frequently_. so it depends on whether you want to make yourself vulnerable to it.)

as for which of the four versions you _should_ digitize, who cares? maybe some scholars would "approve" of one of them, and others approve of another one, but all 4 of these editions were present in some highly-esteemed university libraries, so i don't think it matters. indeed, i see zero reason why you shouldn't digitize all four of them.

***

but even though this question can be brushed aside without effort, i can recognize an interesting experiment when one presents itself. so i took this on, and my word, did it ever live up to its potential...

first, i downloaded all 4 of the versions, and i've presented below the actual text taken from the first two paragraphs of each edition. i put the scans of the first page on my site for your convenience:
> http://z-m-l.com/go/pap/pride_and_prejudice1844.jpg
> http://z-m-l.com/go/pap/pride_and_prejudice1853.jpg
> http://z-m-l.com/go/pap/pride_and_prejudice1870.jpg
> http://z-m-l.com/go/pap/pride_and_prejudice1892v1.jpg

or, if you prefer to view them all on a single page:
> http://z-m-l.com/go/pap/pride_and_prejudice(4).html

as you can see, even in just these _first_two_paragraphs,_ the linebreaks are _completely_different_ in every edition...

also note there are _text_differences_ between the editions. two of the editions have a comma after "of a good fortune", but the other two do not. three of the editions put a comma after "surrounding families", while the fourth edition did not...
in addition, and more striking, three editions use the _british_ spelling of "neighbourhood", while the fourth one does not...

lastly, as far as the pagebreaks go, every one of the 4 editions breaks _this_first_page_ on a _completely_different_line_...

thus, when you pair up your text with a particular scan-set, you will want linebreaks and pagebreaks to match the scans. and the easiest way to ensure you get it right, of course, is to do o.c.r. on the particular scan-set that you want to digitize...

***

here are the first two paragraphs from "pride and prejudice", as exemplified in 4 separate printings between 1844 and 1892.

the 1844 version:
> http://z-m-l.com/go/pap/pride_and_prejudice1844.jpg
>
> It is a truth universally acknowledged, that a single
> man in possession of a good fortune, must be in want
> of a wife.
>
> However little known the feelings or views of such
> a man may be on his first entering a neighbourhood,
> this truth is so well fixed in the minds of the sur-
> rounding families, that he is considered as the right-
> ful property of some one or other of their daughters.

***

the 1853 version:
> http://z-m-l.com/go/pap/pride_and_prejudice1853.jpg
>
> It is a truth universally acknowledged, that a single man
> in possession of a good fortune must be in want of a wife.
>
> However little known the feelings or views of such a
> man may be on his first entering a neighbourhood, this
> truth is so well fixed in the minds of the surrounding
> families, that he is considered as the rightful property of
> some one or other of their daughters.

***

the 1870 version:
> http://z-m-l.com/go/pap/pride_and_prejudice1870.jpg
>
> It is a truth universally acknowledged, that a single
> man in possession of a good fortune, must be in want of a
> wife.
>
> However little known the feelings or views of such a man
> may be on his first entering a neighbourhood, this truth is so
> well fixed in the minds of the surrounding families, that he is
> considered as the rightful property of some one or other of
> their daughters.

***

the 1892 version:
> http://z-m-l.com/go/pap/pride_and_prejudice1892v1.jpg
>
> It is a truth universally acknowledged,
> that a single man in possession of a
> good fortune must be in want of a
> wife.
>
> However little known the feelings or views of
> such a man may be on his first entering a neigh-
> borhood, this truth is so well fixed in the minds of
> the surrounding families that he is considered as
> the rightful property of some one or other of their
> daughters.

***

all in all, it's an obvious answer to the question, don't you think?

-bowerbird
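
p.s. if you'd rather check those differences by machine than by eye, here's a rough python sketch. it's only an illustration, not any established tool: the names `EDITIONS`, `unwrap`, and `word_diffs` are made up for this example, and the texts are just the two paragraphs quoted above, keyed by year. it glues each edition's lines back into running text (naively rejoining words split by an end-of-line hyphen), then uses the standard library's `difflib` to list the words where two editions disagree.

```python
import difflib
import re

# first two paragraphs of each edition, keeping that edition's own linebreaks
# (typed in from the quoted passages; the dict name and keys are illustrative)
EDITIONS = {
    "1844": """It is a truth universally acknowledged, that a single
man in possession of a good fortune, must be in want
of a wife.

However little known the feelings or views of such
a man may be on his first entering a neighbourhood,
this truth is so well fixed in the minds of the sur-
rounding families, that he is considered as the right-
ful property of some one or other of their daughters.""",
    "1853": """It is a truth universally acknowledged, that a single man
in possession of a good fortune must be in want of a wife.

However little known the feelings or views of such a
man may be on his first entering a neighbourhood, this
truth is so well fixed in the minds of the surrounding
families, that he is considered as the rightful property of
some one or other of their daughters.""",
    "1870": """It is a truth universally acknowledged, that a single
man in possession of a good fortune, must be in want of a
wife.

However little known the feelings or views of such a man
may be on his first entering a neighbourhood, this truth is so
well fixed in the minds of the surrounding families, that he is
considered as the rightful property of some one or other of
their daughters.""",
    "1892": """It is a truth universally acknowledged,
that a single man in possession of a
good fortune must be in want of a
wife.

However little known the feelings or views of
such a man may be on his first entering a neigh-
borhood, this truth is so well fixed in the minds of
the surrounding families that he is considered as
the rightful property of some one or other of their
daughters.""",
}


def unwrap(text):
    """Join printed lines into running text, re-gluing end-of-line hyphenation."""
    # naive assumption: every hyphen at a line end is a typesetter's break,
    # so a genuine hyphenated compound split there would be glued too
    text = re.sub(r"-\n", "", text)
    return " ".join(text.split())


def word_diffs(a, b):
    """Return (words_in_a, words_in_b) pairs where the unwrapped texts differ."""
    wa, wb = unwrap(a).split(), unwrap(b).split()
    matcher = difflib.SequenceMatcher(None, wa, wb, autojunk=False)
    return [
        (" ".join(wa[i1:i2]), " ".join(wb[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    base = "1844"
    for year, text in EDITIONS.items():
        if year != base:
            print(base, "vs", year, word_diffs(EDITIONS[base], text))
```

run against these samples, the 1870 comes out word-for-word identical to the 1844 once the linebreaks are ignored, the 1853 differs only in the comma after "fortune", and the 1892 differs in both commas plus the spelling of "neighborhood". the hyphen rule is deliberately crude, so treat this as a starting point rather than a finished comparison tool.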