From Bowerbird at aol.com Mon Nov 3 08:48:20 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 3 Nov 2008 11:48:20 EST
Subject: [gutvol-d] a picture of a thousand words
Message-ID:

here's a message from a long-time sharp e-book observer:

> This blog post from Google discusses that they can now search
> through scanned images inside PDFs.
>
> http://googleblog.http://gohttp://gohttp://googhttp://googlhttp://goo
>
> I think this technology has the potential to make ebooks
> significantly more widespread. Up to now, getting from an old
> out-of-print paper book to an ebook involved a considerable
> amount of effort, especially in the proof-reading phase.
>
> The upside to that effort was that the book became not only
> readable, but also searchable. But now you don't need to go
> through that effort.
>
> With technology like this being widespread, it now becomes
> practical to simply scan older works, and publish as a series of
> scans. If anyone wants to search them, they can - but there is no
> requirement to invest a lot of up-front effort in getting to a
> perfect ebook before you know what the interest - and value - is
> in that effort.
>
> I can see a further benefit to scholarly research, because when
> you never discard the scan - it becomes the ebook, in a sense,
> you no longer have to be concerned as much about potential errors
> in the OCR and proof-read stages. If there is any doubt, you can
> immediately check the underlying page.
>
> We aren't quite at the point where a home scanner can accept a
> paperback and simply scan it into a readable ebook, but we're
> getting closer with steps like this.
>
> It would be nice if at some point, the effort in converting my old
> paper books to ebooks was comparable in time, effort and cost to
> converting my old 78s to MP3s.
>
> : chris smith :::::::::::: : chris : chris : chris : chris :
> : nihil tam munitum quod non expugnari pecunia possit - cicero :

the notion that we no longer need to do o.c.r. on a scan-set -- because google will, and put the results in its database -- ignores the value of users actually _having_ that digital text...

and the assumption that people who are currently doing o.c.r. simply "discard" the scans afterward is... um, well, it's outdated. (although it's still true in far too many cases, including with p.g.)

still, from the standpoint of the person who'd digitize a book, it's definitely appealing to avoid the "up-front effort in getting to a perfect e-book before you know what the interest -- and value -- is in that effort", as chris put it...

of course, there's nothing new here. google has been feeding the o.c.r. results of all its scanned books into its search engine all along, and will continue to do that. so we have searchability, we just don't have the power and flexibility of digital text itself...

-bowerbird

**************
Plan your next getaway with AOL Travel. Check out Today's Hot 5 Travel Deals!
(http://pr.atwola.com/promoclk/100000075x1212416248x1200771803/aol?redir=http://travel.aol.com/discount-travel?ncid=emlcntustrav00000001)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Bowerbird at aol.com Mon Nov 3 14:41:15 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 3 Nov 2008 17:41:15 EST
Subject: [gutvol-d] a picture of a thousand words
Message-ID:

that u.r.l. got botched. it should be:

> http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html

-bowerbird
From Bowerbird at aol.com Tue Nov 4 01:32:25 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 4 Nov 2008 04:32:25 EST
Subject: [gutvol-d] ppgen makes its arrival
Message-ID:

so roger frank has finally brought his "ppgen" out into the open over on the distributed proofreaders site.

ppgen is a means "to assist post-processors in generating e-texts"... it's a set of markup combined with a program to convert to various formats, currently html and l.r.f. (used, i believe, by the sony reader). the usual benefits from a single-source-master workflow are listed.

> http://www.fadedpage.com/c/postprocessing.php
> http://www.fadedpage.com/c/ppgenwalk.php

***

there were _a_bunch_ of naysayers here -- and over at d.p. -- when i introduced z.m.l., which is a highly similar approach. so i'll be very curious to see how they treat this home-grown variant in comparison.

meanwhile, ppgen is a command-line tool, with no g.u.i. i'm not sure why there's so little awareness in p.g./d.p. of the importance of a g.u.i., but hey, maybe that's because i've been a mac person for such a while...

-bowerbird

p.s. one of the books that roger has treated with ppgen is on bees. i noticed this one when it was posted because it had no d.p. credit... sure enough, it didn't go through d.p. perhaps roger has figured out that -- once you've cleaned up the o.c.r. yourself -- there's little need to run the text through the slow and often tortuous queues over at d.p.

From Bowerbird at aol.com Tue Nov 4 11:44:57 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 4 Nov 2008 14:44:57 EST
Subject: [gutvol-d] philosophy towards errors in p.g.
e-texts
Message-ID:

well, i think i have laid bare the truth about errors in p.g. e-texts. they exist. and it isn't even that difficult to find them, in general.

now, to finish, some philosophical ramblings on the various and sundry attitudes on errors that once left a bloody battlefield here.

those who have been here for a while (which is most of the 300 people subscribed, as i've followed the subscription list for years, and found this listserve to be remarkably stable) will remember that one person in particular -- along with a few of his friends -- waged a long campaign on the "untrustworthiness" of p.g. e-texts.

this person argued that there were significant errors in some books, and "bowdlerization" of others, and this invalidated the whole library.

luckily, his efforts to smear the p.g. corpus never got much traction...

however, it did give rise to something quite unfortunate, which was an equally-ridiculous "we have no major errors" counter-attitude, which led many people here to categorically tune out any dialog on errors...

the truth of the matter is that almost all of the p.g. e-texts have errors, but the vast majority of these errors are _not_ of a very serious nature...

only a very small percentage of the books has extremely serious errors...

one such book -- as jose menendez recently pointed out yet again -- is "peter pan at kensington gardens", which is missing 2 (out of 6) chapters. jose documented the same problem years ago, and it was never repaired.

likewise, i have generated long lists of errors for several books, and had no action taken on them, as if the "people in charge" willfully ignored me, which is a very good representation of that "we have no errors" attitude...

the general feeling i get is that p.g. wants to sweep errors under the carpet.

the one exception to this is michael hart himself.
while michael was -- at times -- a ringleader of the "no errors" position, he always represented a much more reasonable version of it, namely that he wanted to be presented a list of errors, rather than a vague accusation.

considering that the "untrustworthy" spin had no solid weight behind it, that was a reasonable position for someone to take, and michael took it.

but -- unlike others within the p.g. bureaucracy -- michael wasn't using this position as a "dodge". he sincerely wanted to know about the errors.

when he has been presented with solid evidence, michael has accepted it, and been pleasant about it, realizing that it was offered up in good faith...

unfortunately, no one else at p.g. has been nearly as responsive as michael. i get the distinct impression that no one there wants to know about errors.

it's kinda sad, because one of the biggest complaints against p.g. e-texts -- indeed, i would say that it is far-and-away the _biggest_ complaint -- is they have too many errors in them... and it's a problem that's _fixable_.

but you've got to have the right attitude toward _doing_ those repairs...

and i see no evidence that the whitewashers have that attitude.

yet it is the whitewashers who have _reserved_ that job _for_themselves_.

error-repair has to work through the "reposting" process, and they have made that a rather serious (and therefore clumsy) process to go through.

it all happens "in the back room", and there's no public transparency to it. the public is in the dark in regard to the error-reports that've been made, the number of those reports, the ongoing status of the reports, and so on.

furthermore, this "shroud of secrecy" around the error-reports means that the public has little idea project gutenberg is interested in fixing its errors.

which means that the public, by and large, doesn't bother to report errors.
even though -- as i said -- the public's biggest complaint about the e-texts is the errors, i sense no mission on the part of the public to _report_errors_.

they bellyache about the errors, but they don't report them.

and this is a huge failing on the part of p.g., to motivate those error-reports.

p.g. _has_ those well-known "million eyeballs" that can track down all bugs, but if none of those eyeballs are _reporting_ them, errors will go unrepaired.

again, this is a huge failing.

***

so here's a list of what could/should be done:

1. automate procedures that can locate and repair errors _programmatically_.
2. streamline the process of updating e-texts with error-repaired editions...
3. create an error-reporting system that's both public and fully transparent.
4. as part of this system, link up p.g. e-texts to publicly-available scan-sets.
5. make announcements to the public to specifically request error-reports!
6. make sure that people who report errors get credit for doing that work...

none of this is difficult to do.

***

as i said, this battlefield has been marked by two extreme positions...

throughout, i've maintained that p.g. e-texts have very few serious errors, but are marked (and marred) throughout by a large number of less-serious errors.

walking the centerline has been thankless, and i have been quite disappointed. michael has had this reasonable attitude too, and i sense he's disappointed too.

i've given suggestions on what needs to be done. but is there any chance of that ever being done? i doubt it. not if we let it be swept under the rug.

so i will continue -- until i die -- to make noise, continue until people who hold the opposite opinion have no leg to stand on, continue pointing out steps that would solve these problems, so that when someone down the line says "let's do it like this" -- meaning _fix_the_errors_ -- there will be zero opposition.

-bowerbird

From Bowerbird at aol.com Tue Nov 4 12:08:43 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 4 Nov 2008 15:08:43 EST
Subject: [gutvol-d] toward a philosophy of errors in e-texts
Message-ID:

here are some thoughts of mine on a philosophy of errors in e-texts.

***

there are a variety of orientations people can have towards e-books here in the first decade of the 21st century.

ironically, in this modern age, most of the books that are available to us right now as "e-books" are _very_old_ books, not new ones...

this is because of the large scanning projects -- mostly google -- and the laws concerning the public domain. although the recent "proposed settlement" between google and the author/publishers might change this equation, for now what we have are old books.

if a person is interested in an old book -- in the public domain -- there's some chance that they can find a copy of it online, for free. as more and more time passes, that chance will become a certainty. indeed, you can already find _multiple_ scan-sets of some books...

it's good to realize that if all you want to do is _read_ such a book, these scan-sets are perfectly ok. just like the p-books of the past, which they mirror... nobody in the past ever complained that they couldn't "copy and paste" text from a p-book. they just retyped it.

in "read-only" mode, errors in the p-book (and thus the scan-set) weren't much of a big deal. they existed, but they were fairly rare, and people learned to read around them, and everything was fine.

so too with the scan-sets.

***

however, if you want to do much of anything _more_ with a book than to read it, then it's downright handy to have it as digital text.
we've grown accustomed to working with text in a digital format; the ability to copy-and-paste it, to edit and reformat it, is great... computer-aided search, in particular, is a fantastic leap forward...

so it's only natural that we'd want to have our books as digital text.

and, given the fluidity of that digital text, it's also only natural that we would want to have that digital text be as error-free as possible.

grammarians frequently remind us that perfection is an "absolute", you can't be "almost perfect", just like you can't be "almost pregnant". something is either _perfect_, or it is _not_perfect_. never the twain.

but we should recall that _the_march_to_perfection_ is a _process_, so yes sir, there are indeed "degrees of perfection" in _that_ sense... and we should always keep ourselves on the path toward perfection.

***

so, as for the people who simply want to read a book, the scan-sets are just fine, and we recognize that, and leave them to their pleasure.

***

but those of us who want to "do more" with books, as a _cyberlibrary_, have an obligation to clean up the errors in these books as best we can.

i see three specific niches in this effort.

the first one -- from the standpoint of history -- is project gutenberg. volunteers will do the best that they can to make books as digital text, and will take books as far along the road to perfection as they are able. distributed proofreaders, the current champion of that type of effort, is a good example, where a group of volunteers does this "for the public". wikisource is yet another good example of this "do it for others" niche...

the second niche -- one that has historical roots in project gutenberg too, but one that (i believe) will return to prominence in the future -- are individuals who will digitize a book in which they have an interest.
as the tools that facilitate digitization improve, to the extent that they have no learning curve, negating the need for "digitization experience", isolated individuals will be able to pick up those tools and code a book. it's quicker to do that than to take years to push the book through d.p.

furthermore, this niche will be aided by a new policy of the o.c.a., which allows an individual to request that they scan (and o.c.r.) a specific book.

this niche is the one where individuals do a digitization "for themselves", although they are then perfectly willing to share the results with others. collecting these individual efforts can happen naturally via the internet.

the third sphere in the digitizing of books is those big scanning projects, which is essentially google right now, though the o.c.a. is a presence too, and smaller ones might emerge as well as time goes on. as we recognize, increasingly, the value of digital text compared to scan-sets, there will be an ever-continuing and increasing focus on error-correction of the o.c.r.

my guess is that the google labs, all by themselves, will be able to produce error-free digital text, easily, based on their multiple scan-sets of a book... but even smaller efforts will seize upon the secret; i've shown it's not hard.

***

so, how do the numbers look from each of these niches?

p.g. -- thanks to a huge d.p. contribution -- adds ~3,500 books a year. that's quite commendable. the only troubling thing about that number is that it seems to have hit a plateau now, with no breakthrough visible...

individual digitizers, operating independently of p.g., currently do about 1,000 books a year, i would estimate, although their results have not yet been centralized so that number could be verified, or taken advantage of. nonetheless, if this niche does indeed come to fruition soon, as i suspect, that problem might be solved.
however, even if the numbers come to rise, so that they even surpass p.g./d.p./wikisource, they'll still be insignificant.

the reason the numbers for those two niches will be insignificant is because google (all by itself) has already scanned _millions_ of books, quite literally. so efforts that do thousands of books a year -- or even tens of thousands -- are soon gonna fall hopelessly behind. as a matter of fact, they already have. google scans more books daily -- before lunch -- than d.p. makes in a year.

all this means the only real hope for error-free digital text is _programmatic_ correction of o.c.r., with a _millions_of_eyeballs_ final-correction approach...

so, does this mean that the first two niches should just give up? heavens no! they are pioneers who'll help test the tools that do programmatic corrections.

it's vitally important that we, as the public, build such tools ourselves, so that google is not the only entity around with access to an error-free cyberlibrary.

we have to demonstrate that we are capable of generating the same accuracy, so that google will make the decision to do it for us, for the glow of good will. otherwise, they'll be tempted to keep their accurate digital text to themselves, which is what they seem to have decided to do thus far. we must change that.

-bowerbird

From ajhaines at shaw.ca Tue Nov 4 13:59:16 2008
From: ajhaines at shaw.ca (Al Haines (shaw))
Date: Tue, 4 Nov 2008 13:59:16 -0800
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
References:
Message-ID: <000d01c93ec8$94cb42e0$6401a8c0@ahainesp2400>

Re this snippet:

>only a very small percentage of the books has extremely serious errors...
>one such book -- as jose menendez recently pointed out yet again -- is
>"peter pan at kensington gardens", which is missing 2 (out of 6) chapters.

It's my understanding, from the people at PG who remember the history of this submission (PG #1332), that its source was possibly an old Penguin reprint that contained only four of the six stories. (Jose's version is based on the six-story Scribner's edition.)

BTW--In September, bowerbird made these comments about the reposted 1332::

>you'd think that, upon being told that they had missed _one-third_ of a book,
>they'd go back immediately and fix that huge problem, wouldn't you? i would.
>yet this book managed to sit at p.g., undisturbed, for nearly 3 years after that.
>that's bad enough. but even worse, then it was "reposted", _without_ being fixed!

Given that 1332's source more than likely had only four stories, bowerbird owes both the original submitter and the reposter an apology.

Al

_______________________________________________
gutvol-d mailing list
gutvol-d at lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d

From marcello at perathoner.de Tue Nov 4 14:39:51 2008
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue, 04 Nov 2008 23:39:51 +0100
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
In-Reply-To: <000d01c93ec8$94cb42e0$6401a8c0@ahainesp2400>
References: <000d01c93ec8$94cb42e0$6401a8c0@ahainesp2400>
Message-ID: <4910CF37.90707@perathoner.de>

Al Haines (shaw) wrote:

> Given that 1332's source more than likely had only four stories, bowerbird owes
> both the original submitter and the reposter an apology.
Just tell him to send any errors he finds to errata at pglaf.org and to stop pissing and moaning. Don't give him an audience here.

From Bowerbird at aol.com Tue Nov 4 17:25:21 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 4 Nov 2008 20:25:21 EST
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
Message-ID:

al said:
> It's my understanding, from the people at PG
> who remember the history of this submission
> (PG #1332), that its source was possibly
> an old Penguin reprint that contained
> only four of the six stories.
> (Jose's version is based on the six-story Scribner's edition.)

you know, al, it really would help the level of the dialog
if you _honestly_ got yourself up to speed on the topic,
and not just from "those who remember the history"...

my mind is always open, so if you can point to such a
"penguin reprint" -- i'll accept a pointer from o.c.l.c. --
i'd more than happily accept this version of the story...
it might not even surprise me, because -- if i remember
correctly -- the original submission was from ron burkey,
who's more well-known as the developer of "gutenmark",
and i wouldn't have expected him to make such a gaffe...
but, you know, even the best of us make mistakes at times.

however...

if you'd consulted the full history of this particular report,
you'd have seen david starner tried to mount this defense,
on the bookpeople listserve where it was initially presented,
and jose essentially destroyed it with extremely sound logic.

if i recall it correctly, it had to do with characters appearing
out of sequence in the 4-chapter version that's posted at p.g.
and there might've been other stuff. you can go and read it...
google "bookpeople kensington jose" to get it as the first hit:
> http://onlinebooks.library.upenn.edu/webbin/bparchive?year=2005&post=2005-10-27,5

i am quite sure that there was enough evidence mounted that
i would've taken the book down _immediately_ until i resolved
the issue as to its soundness.
yet not only was that _not_ done,
but the book was eventually even reposted to its "new" location.

in summary, even if there _is_ such a 4-chapter "penguin reprint",
according to jose's arguments, it was a bad hatchet job on the book,
and it should _not_ be representing the book at project gutenberg...

there is no shortage of paper-books that showed it with 6 chapters.

this is the type of thing that gave any credence at all to "that person" who was trying to sabotage the reputation of project gutenberg...

in short, al, that excuse doesn't hold any water... which you would have known if you'd done your homework...

> Given that 1332's source more than likely had only four stories,
> bowerbird owes both the original submitter and the reposter an apology.

my goodness. now you've moved from the realm of "possibly" to the position that it is "more than likely". sorry. i don't buy it. and i've just told you why. i think it's a pretty ludicrous excuse.

oh, and no, al, you're wrong, i don't "owe" _anyone_ an apology. first, because you haven't presented any evidence that i'm wrong, and second, because i'm addressing _policy_, not _personalities_.

but no, don't worry, al, i won't demand one from you in return, for leveling a silly excuse and countercharge. i'll just let people deduct whatever they deem necessary from your credibility in their eyes...

-bowerbird

p.s. al, it isn't necessary to cc: me a duplicate copy; i get the listserve. and marcello, it sure isn't necessary for _you_ to cc: me a duplicate, since everything from your address goes directly to my spam folder...
From ebooks at ibiblio.org Wed Nov 5 11:00:18 2008
From: ebooks at ibiblio.org (Jose Menendez)
Date: Wed, 05 Nov 2008 14:00:18 -0500
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
In-Reply-To:
References:
Message-ID: <4911ED42.4050203@ibiblio.org>

On Nov. 4, 2008, Bowerbird wrote:

> al said:
> > It's my understanding, from the people at PG
> > who remember the history of this submission
> > (PG #1332), that its source was possibly
> > an old Penguin reprint that contained
> > only four of the six stories.
> > (Jose's version is based on the six-story Scribner's edition.)
>
> you know, al, it really would help the level of the dialog
> if you _honestly_ got yourself up to speed on the topic,
> and not just from "those who remember the history"...
>
> my mind is always open, so if you can point to such a
> "penguin reprint" -- i'll accept a pointer from o.c.l.c. --
> i'd more than happily accept this version of the story...
> it might not even surprise me, because -- if i remember
> correctly -- the original submission was from ron burkey,
> who's more well-known as the developer of "gutenmark",
> and i wouldn't have expected him to make such a gaffe...
> but, you know, even the best of us make mistakes at times.

A few days after I posted my gutvol-d message about PG's "Peter Pan in Kensington Gardens" (on Sept. 12th) I received an email from someone at PG, saying that PG's ebook was based on an incomplete Penguin edition. I promptly checked Google Book Search and found a 1995 Penguin edition:

http://books.google.com/books?id=TvKwAQAACAAJ

But there's no preview available or any details about the book's contents. Using the ISBN number given by Google, 0146000773, however, I looked it up in WorldCat:

http://www.worldcat.org/oclc/36563655?tab=details

If you scroll down to the "Item Details," you'll see this:

"Contents: Peter Pan -- The Thrush's nest -- The little house -- Lock-out time."
So WorldCat only lists four chapters for that edition, and they're in the same order as in PG's ebook. Unfortunately, the nearest library to me with a copy is more than 400 miles away, so I wasn't able to examine a copy to see if it really has only those four chapters and, if so, whether it mentions that it's an abridged edition or whether Penguin claimed a copyright on such a substantially abridged and re-arranged edition. > however... > > if you'd consulted the full history of this particular report, > you'd have seen david starner tried to mount this defense, > on the bookpeople listserve where it was initially presented, > and jose essentially destroyed it with extremely sound logic. > > if i recall it correctly, it had to do with characters appearing > out of sequence in the 4-chapter version that's posted at p.g. > and there might've been other stuff. you can go and read it... > google "bookpeople kensington jose" to get it as the first hit: > > http://onlinebooks.library.upenn.edu/webbin/bparchive?year=2005&post=2005-10-27,5 Besides the things I pointed out in that post to the Book People mailing list back in 2005, we only have to look at the very first paragraph in PG's ebook to see that it's a defective edition: http://www.gutenberg.org/dirs/1/3/3/1332/1332-h/1332-h.htm#2H_4_0001 If you ask your mother whether she knew about Peter Pan when she was a little girl she will say, "Why, of course, I did, child," and if you ask her whether he rode on a goat in those days she will say, "What a foolish question to ask, certainly he did." Then if you ask your grandmother whether she knew about Peter Pan when she was a girl, she also says, "Why, of course, I did, child," but if you ask her whether he rode on a goat in those days, she says she never heard of his having a goat. Perhaps she has forgotten, just as she sometimes forgets your name and calls you Mildred, which is your mother's name. Still, she could hardly forget such an important thing as the goat. 
Therefore there was no goat when your grandmother was a little girl. This shows that, in telling the story of Peter Pan, to begin with the goat (as most people do) is as silly as to put on your jacket before your vest. The word "goat" appears 6 times just in that first paragraph, but it doesn't appear again in the whole ebook. Not once! Surprising, isn't it, considering that in that first paragraph Barrie referred to the goat as "such an important thing"? The reason "goat" doesn't appear again is that the chapter that explains how Peter Pan got the goat, "Peter's Goat," is missing. And here's an example of a continuity problem due to the last two chapters being reversed. This link goes to the third chapter, "The Little House," in PG's HTML version: http://www.gutenberg.org/dirs/1/3/3/1332/1332-h/1332-h.htm#2H_4_0003 Now scroll up a little to the end of the previous chapter, "The Thrush's Nest." Here are the last few lines from that chapter: Of course, he had no mother--at least, what use was she to him? You can be sorry for him for that, but don't be too sorry, for the next thing I mean to tell you is how he revisited her. It was the fairies who gave him the chance. But the chapter following those lines, "The Little House," says *nothing* about how Peter Pan revisited his mother. To read about that you have to skip to the last chapter in the ebook, "Lock-out Time": http://www.gutenberg.org/dirs/1/3/3/1332/1332-h/1332-h.htm#2H_4_0004 > i am quite sure that there was enough evidence mounted that > i would've taken the book down _immediately_ until i resolved > the issue as to its soundness. yet not only was that _not_ done, > but the book was eventually even reposted to its "new" location. > > in summary, even if there _is_ such a 4-chapter "penguin reprint", > according to jose's arguments, it was a bad hatchet job on the book, > and it should _not_ be representing the book at project gutenberg... 
If that WorldCat description is correct and Penguin's edition only had those four chapters and in that order, it was a "hatchet job." I'd like to see a copy of it to see if it has an introduction or note from the editor explaining what was done to the book and why. > there is no shortage of paper-books that showed it with 6 chapters. > this is the type of thing that gave any credence at all to "that person" > who was trying to sabotage the reputation of project gutenberg... > > in short, al, that excuse doesn't hold any water... > which you would have known if you'd done your homework... Well, Bowerbird, if Al or you had done your "homework," either one of you could have found both that Google Book Search listing and the WorldCat listing in just a couple of minutes. :) > > Given that 1332's source more than likely had only four stories, > > bowerbird owes both the original submitter and the reposter an apology. > > my goodness. now you've moved from the realm of "possibly" > to the position that it is "more than likely". sorry. i don't buy it. > and i've just told you why. i think it's a pretty ludicrous excuse. From that WorldCat listing, it seems that Al's excuse is *very* likely. That doesn't alter the fact that it's an incomplete, defective edition. And as I replied to the person from PG who told me the ebook was based on a Penguin edition, "My biggest problem with PG's version isn't that it's incomplete, but that it doesn't mention that it's incomplete, which misleads readers into thinking they're getting the whole book." Jose Menendez P.S. I've been rather busy lately, but I hope to respond to some other posts in the near future. From Bowerbird at aol.com Wed Nov 5 12:41:36 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 5 Nov 2008 15:41:36 EST Subject: [gutvol-d] philosophy towards errors in p.g. 
e-texts Message-ID: i said: > however, it did give rise to something quite unfortunate, which was an > equally-ridiculous "we have no major errors" counter-attitude, which > led many people here to categorically tune out any dialog on errors... once i'd put that "we have no major errors" line in their mouth, i had to pause, and ask myself "is that really accurate?", and so i thought about it, and said, "well, yeah, i know that it sounds extreme, but it _is_ accurate"... so i left it in. it still sounded a bit extreme on re-reading. but i left it in. now that we've seen how al reacted, you too can see that it was accurate. i've always maintained that the number of badly-botched books is small. very small. i'd put the number at 27 or less. out of 27,000, that is tiny... in a library of some ~27,000 files, it works out to just one in a thousand. so, if you understand the magnitude of what i'm saying, it's not that bad. in fact, it helps to remember it was a _counterargument_ to "that person" who was trying to impugn the entire library based on these "bad apples". (he was always vague on what they were, or how many of them there are, because that left a troubling impression that there are far more of them.) yet even with just _1_ of these badly-botched books being documented, al _still_ wants to maintain that "we have no major error" on that e-text... he wants to say that if there are only 4 chapters in this particular e-text, there must have only been 4 chapters in the p-book that was digitized... well, um, maybe. but we have several independent indications that there are _actually_ 6, so you're missing two... and have others out of order... so even if this e-text matches some p-book, that p-book was botched. so your e-text is still an error. a different one, but an error nonetheless. and not only does al want to maintain that "we have no major errors", but he wants to put his "feelings" out there in the middle of the road and then complain when a car runs over them. 
this isn't personal, al. jose made his error-report on this e-text _over_three_years_ago_ now. long before you were ever a whitewasher. and it's not _your_ fault that nobody seems to have collected up that error-report. it's not your fault that p.g. doesn't have a public-and-transparent error-reporting system which you could have checked to find out if there were any error-reports on that e-text before you bequeathed it more legitimacy by reposting it. but it _is_ your fault, al, that you adopt this counterproductive attitude, and then tune out any dialog on errors. it doesn't make you look good. and it doesn't help project gutenberg -- at all -- when the people who have reserved the ability to correct errors unto themselves also want to maintain that there are no major errors in the library. it's just suicidal... -bowerbird p.s. i should've mentioned that jim tinsley also had a pretty good handle on the number of errors in a typical e-text, which he estimated as _50_: > http://www.pgdp.net/phpBB2/viewtopic.php?p=124624#124624 sadly, however, it was the "sticker-shock" from that revelation that soon led to d.p. deciding they needed to do "more rounds", sealing the deal on a long-contemplated change to separate proofing and formatting and -- eventually -- adding a 3rd round of proofing. as i have shown here, repeatedly, on book after book, those extra proofing rounds are largely (and sometimes even completely) unnecessary, if d.p. did pre-processing. oh yeah, although jim had a good handle on the real number of errors, he wasn't any better at tolerating a dialog on how best to find and fix 'em. ************** AOL Search: Your one stop for directions, recipes and all other Holiday needs. Search Now. (http://pr.atwola.com/promoclk/100000075x1212792382x1200798498/aol?redir=http://searchblog.aol.com/2008/11/04/happy-holidays-from -aol-search/?ncid=emlcntussear00000001) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Bowerbird at aol.com Wed Nov 5 13:13:05 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 5 Nov 2008 16:13:05 EST Subject: [gutvol-d] philosophy towards errors in p.g. e-texts Message-ID: jose said: > Well, Bowerbird, if Al or you had done your "homework," > either one of you could have found both that Google Book Search > listing and the WorldCat listing in just a couple of minutes. :) how do you know i didn't? :+) but that's beside the point... a 4-chapter version of this book is a hatchet job... your arguments are very convincing on the matter. and they were convincing when you made them 3 years ago too. i never complained that the book was once posted. crap happens. i even said that, based on the producer, i believed he did it right... but once it'd been determined that this p-book was a hatchet-job, the e-text should have been pulled. or at least annotated as such. (but seriously, just pull it; why leave a botched book in the library?) more to the point, it should've been re-done with a good p-book... why wasn't this done? i dunno. it was discussed, over at d.p., in 2007, 18 months after your report: > http://www.pgdp.net/phpBB2/viewtopic.php?t=26280&start=15 > http://www.pgdp.net/phpBB2/viewtopic.php?t=26280&start=30 that effort fizzled, i guess... even if you don't re-do the book, though, pull that bad e-text. it's not like the severe problems with this e-text were unknown. they were widely acknowledged publicly, but fell through a crack. so -- when it came time to be "reposted" -- al just went and did it. like i said, not his fault. i blame it on a faulty infrastructure. you fix your infrastructure by paying attention to where it's broken. -bowerbird p.s. jose's record: signal: 2, noise: 3. ************** AOL Search: Your one stop for directions, recipes and all other Holiday needs. Search Now. 
(http://pr.atwola.com/promoclk/100000075x1212792382x1200798498/aol?redir=http://searchblog.aol.com/2008/11/04/happy-holidays-from -aol-search/?ncid=emlcntussear00000001) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajhaines at shaw.ca Wed Nov 5 13:36:46 2008 From: ajhaines at shaw.ca (Al Haines (shaw)) Date: Wed, 5 Nov 2008 13:36:46 -0800 Subject: [gutvol-d] philosophy towards errors in p.g. e-texts References: Message-ID: <000d01c93f8e$9aa57390$6401a8c0@ahainesp2400> Actually, I was not Kensington 1332's reposter. It was David Widger, and at the time of the reposting, he was as unknowing of 1332's history as I was. If you want two six-chapter versions of Kensington Gardens, look no further than 26998 (derived from the Scribner edition) and 26999 (from the Hodder & Stoughton edition). ----- Original Message ----- From: Bowerbird at aol.com To: gutvol-d at lists.pglaf.org ; Bowerbird at aol.com Sent: Wednesday, November 05, 2008 1:13 PM Subject: Re: [gutvol-d] philosophy towards errors in p.g. e-texts jose said: > Well, Bowerbird, if Al or you had done your "homework," > either one of you could have found both that Google Book Search > listing and the WorldCat listing in just a couple of minutes. :) how do you know i didn't? :+) but that's beside the point... a 4-chapter version of this book is a hatchet job... your arguments are very convincing on the matter. and they were convincing when you made them 3 years ago too. i never complained that the book was once posted. crap happens. i even said that, based on the producer, i believed he did it right... but once it'd been determined that this p-book was a hatchet-job, the e-text should have been pulled. or at least annotated as such. (but seriously, just pull it; why leave a botched book in the library?) more to the point, it should've been re-done with a good p-book... why wasn't this done? i dunno. 
it was discussed, over at d.p., in 2007, 18 months after your report:
> http://www.pgdp.net/phpBB2/viewtopic.php?t=26280&start=15
> http://www.pgdp.net/phpBB2/viewtopic.php?t=26280&start=30

that effort fizzled, i guess...

even if you don't re-do the book, though, pull that bad e-text.

it's not like the severe problems with this e-text were unknown.
they were widely acknowledged publicly, but fell through a crack.

so -- when it came time to be "reposted" -- al just went and did it.

like i said, not his fault. i blame it on a faulty infrastructure.

you fix your infrastructure by paying attention to where it's broken.

-bowerbird

p.s. jose's record: signal: 2, noise: 3.

------------------------------------------------------------------------------
_______________________________________________
gutvol-d mailing list
gutvol-d at lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Bowerbird at aol.com Wed Nov 5 14:13:26 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Wed, 5 Nov 2008 17:13:26 EST
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
Message-ID:

al said:
> Actually, I was not Kensington 1332's reposter. It was David Widger, and
> at the time of the reposting, he was as unknowing of 1332's history as I was.

doesn't matter who reposted it. my complaint has never been with volunteers.
i love all of the volunteers. every one. including you. especially david widger.

the problem is with the infrastructure.

(and yes, i fully realize that the infrastructure was built by volunteers, and
continues to be retained by volunteers, but i separate them in my mind...)

> If you want two six-chapter versions of Kensington Gardens,
> look no further than 26998 (derived from the Scribner edition)
> and 26999 (from the Hodder & Stoughton edition).

perhaps you didn't read it, but i commented when these e-texts were posted.

these e-texts don't answer the question about why the book wasn't re-done
earlier, either in october of 2005 when jose posted the original error-report,
or in april of 2007 when it bubbled to the surface at distributed proofreaders.
it would be useful in correcting the _workflow_ to examine those breakdowns.

furthermore, these e-texts do not answer the question about why an e-text
which represents a botched p-book is still online. it badly needs to be pulled.
because as long as it's online, it gives credence to the "bad apples" sabotage.

and again, the real issue is not any one particular e-text. it is _the_workflow_.

hundreds of subscribers to the bookpeople listserve knew about this problem.
how is it the case that david widger didn't know? please answer that question.
it doesn't have to be an answer to this list, doesn't have to be an answer to me.
but it _does_ need to be an answer to you, and the rest of the p.g. bureaucracy.

your workflow is flawed. you need to fix it. if you won't engage in dialog with
those of us who see that it is flawed, and have ideas about how you can fix it,
then you need to do it yourselves. but however it's done, it needs to be done.

and the longer you protract it, instead of just responding, "yeah, you're right,
and we'll fix that", or even just "we will fix it", the worse it makes you look, al...

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Bowerbird at aol.com Thu Nov 6 12:35:02 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 6 Nov 2008 15:35:02 EST
Subject: [gutvol-d] how to spot missing sentence-terminating punctuation
Message-ID:

over at d.p., cbgrf (a.k.a. chuck) said this:
> in SmoothReading I found a number of instances of
> sentences which ended without punctuation. This is
> something very hard to find without actually reading
> the entire work. Even then I think that the eye is
> doomed to see what it wants to see.
>
> What I would like to find (perhaps it exists already)
> is a file, usable within Guiguts, which would help find
> lacking end-of-sentence punctuation.
>
> He went Now he was there.
>
> I am thinking that a file, which could be used at
> the Stealth Scanno button within Guiguts, would
> help by lining up all the instances... instead of just
> a one-by-one search with a regex, which would
> stop at every proper noun.
>
> In a line-up one could see instances of capital letters
> preceded by no punctuation... sentences which begin
> with pronouns, etc... (there would be a good chance
> that these are errors.)
>
> Any ideas?
>
> http://www.pgdp.net/phpBB2/viewtopic.php?p=501400#501400

yeah, i got some ideas, chuck... :+)

first, you really do need -- first thing -- to get a list of the
proper names within the book, so you don't flag 'em, not
just during this particular routine, but with _any_ one.

and once you have such a list, of course, it's easy enough
to discard them when they come up in this check as well...

in pseudocode, find any lowercase-whitespace-uppercase,
and discard if the uppercase is one of the proper names...

***

what if you don't have that list of proper names in the book?

well, _get_it_. how do you get it? well, there are a lot of ways,
but one of the best is to reverse-engineer what you just did...

in pseudocode, _look_ for lowercase-whitespace-uppercase,
and then collect all the uppercase words. voila, list of names.
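The two pseudocode steps above can be sketched in Python. This is only a rough illustration, not a Guiguts feature or anyone's actual tool: the function names, the regular expression, and the sample sentence with its names are all assumptions made for the example. The set of proper names is taken to be something a person reviews by hand from the collected candidates.

```python
import re
from collections import Counter

# A capitalized word preceded by whitespace that itself follows a lowercase
# letter. The lookbehind keeps the lowercase letter out of the match, so
# consecutive capitalized words ("Peter Pan") are both picked up.
PATTERN = re.compile(r"(?<=[a-z])\s+([A-Z][A-Za-z'-]*)")


def collect_capitalized(text):
    """Count capitalized words that follow lowercase-then-whitespace.

    This is the 'reverse-engineered' candidate list of names; it still
    needs a human to separate real names from ordinary words.
    """
    return Counter(m.group(1) for m in PATTERN.finditer(text))


def flag_missing_punctuation(text, proper_names):
    """Return (offset, word) for each match whose word is not a known name."""
    return [
        (m.start(1), m.group(1))
        for m in PATTERN.finditer(text)
        if m.group(1) not in proper_names
    ]


# Hypothetical sample: chuck's example sentence plus an invented second one.
sample = "He went Now he was there. Then he saw Peter and Maimie near the gate."

candidates = collect_capitalized(sample)

# Assumed outcome of a human review of the candidates.
names = {"Peter", "Maimie"}

flags = flag_missing_punctuation(sample, names)
print(flags)
```

Ordinary words that land in the candidate list (here "Now") are the suspected missing-punctuation spots. The pronoun "I" would also be matched by this pattern, so in practice it would probably be added to the set of words to ignore.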
is it a perfect list? no, not quite, but you'll find it's very good.

and many of the "problems with this list of names" are exactly
what you were looking for in the first place, i.e., places where
the sentence-terminating punctuation was missed by the o.c.r.
so you've ended up locating them anyway...

***

if you're still loath to collect that list of names, then i feel
sorry for your laziness, since in the end it means you must
work _harder_ than you would have had to work otherwise,
but i will still give you yet another tip, on doing without it...

in pseudocode, find lowercase-whitespace-uppercase, then
check the uppercase word in your dictionary. if it's in there,
it's probably not a name, and you should definitely flag that.

some names are indeed valid words, so there's false-alarms,
but you shouldn't miss any of the errors you are looking for.

so all you needed was a dictionary, one without names in it.
if your dictionary includes names in it, use this one instead:
> http://z-m-l.com/go/regulardictionary.txt

i'm happy to help...

-bowerbird

p.s. a search for other punctuation (e.g., comma, semicolon)
followed by whitespace and then uppercase is also a good way
to create your list of names, so include them in your search too.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ebooks at ibiblio.org Thu Nov 6 21:10:43 2008
From: ebooks at ibiblio.org (Jose Menendez)
Date: Fri, 07 Nov 2008 00:10:43 -0500
Subject: [gutvol-d] philosophy towards errors in p.g. e-texts
In-Reply-To: <000d01c93f8e$9aa57390$6401a8c0@ahainesp2400>
References: <000d01c93f8e$9aa57390$6401a8c0@ahainesp2400>
Message-ID: <4913CDD3.7040404@ibiblio.org>

On Nov.
5, 2008, Al Haines wrote: > If you want two six-chapter versions of Kensington Gardens, look no > further than 26998 (derived from the Scribner edition) and 26999 (from > the Hodder & Stoughton edition). You may recall that in my first gutvol-d post about PG's original version of "Peter Pan in Kensington Gardens," I mentioned that Hodder & Stoughton edition at the Internet Archive. I wrote I would have preferred using the Hodder & Stoughton edition because it was published with 50 color plates vs. only 16 in the Scribner's, but several of the plates were missing from the H & S copy the IA scanned. I'm a bit surprised that you made an ebook from it anyway, despite the missing plates, especially since you also made one from the Scribner's edition. Even if I agree that it's worthwhile to make ebooks from both editions, I don't see much point in making an ebook from a defective copy. By the way, those missing color plates aren't the only problem with the IA's scanned copy of that Hodder & Stoughton edition. I neglected to mention that when it was scanned, they skipped a 2-page spread, namely the blank back of one of the color plates and page 19 of the text. Take a look at this sentence in your ebook: http://www.gutenberg.org/dirs/2/6/9/9/26999/26999-h/26999-h.htm#chap02 Perhaps she has forgotten, just as * distinctly remembered a youthful desire to return to the tree-tops, and with that memory came others, as that he had lain in bed planning to escape as soon as his mother was asleep, and how she had once caught him half-way up the chimney. Where I inserted the asterisk is where page 19 is missing. So instead of having only one incomplete, defective copy of the book, PG now has two. :) Jose Menendez P.S. 
You can see how much text is missing by looking at the ebook you made from the Scribner's edition: http://www.gutenberg.org/dirs/2/6/9/9/26998/26998-h/26998-h.htm#chap02 Perhaps she has forgotten, just as she sometimes forgets your name and calls you Mildred, which is your mother's name. Still, she could hardly forget such an important thing as the goat. Therefore there was no goat when your grandmother was a little girl. This shows that, in telling the story of Peter Pan, to begin with the goat (as most people do) is as silly as to put on your jacket before your vest. Of course, it also shows that Peter is ever so old, but he is really always the same age, so that does not matter in the least. His age is one week, and though he was born so long ago he has never had a birthday, nor is there the slightest chance of his ever having one. The reason is that he escaped from being a human when he was seven days old; he escaped by the window and flew back to the Kensington Gardens. If you think he was the only baby who ever wanted to escape, it shows how completely you have forgotten your own young days. When David heard this story first he was quite certain that he had never tried to escape, but I told him to think back hard, pressing his hands to his temples, and when he had done this hard, and even harder, he distinctly remembered a youthful desire to return to the tree-tops, and with that memory came others, as that he had lain in bed planning to escape as soon as his mother was asleep, and how she had once caught him half-way up the chimney. From Bowerbird at aol.com Fri Nov 7 10:40:59 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 7 Nov 2008 13:40:59 EST Subject: [gutvol-d] philosophy towards errors in p.g. e-texts Message-ID: whoa. that's mind-boggling. you do two different versions of a book, one after the other, and you don't even do a comparison of one against the other? wow. 
you obviously don't recognize the power of such a comparison in helping to scoot _both_ copies in their march to perfection... it didn't even _occur_ to me to check that you'd done that step. but if you miss a full page out of one book, and not the other, then you certainly haven't done that comparison... so i did it... in addition to the fact that that one page was missing -- would that be 1 error for 1 missing page, or 24 errors for 24 missing lines, or 222 errors for 222 missing words, or 902 errors for 902 missing characters, how do you count it? -- i find 45 other places where the two books are different. sure, some of them might be edition differences, but... ...well, wouldn't it be nice if you could state that explicitly? and a few differences -- at least like this one -- are obviously errors: > Yon can't think how pleased Peter was to learn that all the people > You can't think how pleased Peter was to learn that all the people whenever a very close-in analysis on p.g. errors is done, the results usually end up being worse than i expected... that's very troubling... and it's not that the errors are _significant_... because they aren't... (well, ok, a missing page, that's significant. missing chapters as well. but most of the errors in most of the e-texts are not too significant.) but even if the errors aren't significant, they're sloppy. unnecessary. it's time for d.p. and the whitewashers to finally concede that they need the help of a million eyeballs to fix all of the errors, and to actively create the infrastructure that would allow that... -bowerbird p.s. jose's ratio: 3 signal, 3 noise. ************** AOL Search: Your one stop for directions, recipes and all other Holiday needs. Search Now. (http://pr.atwola.com/promoclk/100000075x1212792382x1200798498/aol?redir=http://searchblog.aol.com/2008/11/04/happy-holidays-from -aol-search/?ncid=emlcntussear00000001) -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From Bowerbird at aol.com Tue Nov 11 00:01:01 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 11 Nov 2008 03:01:01 EST
Subject: [gutvol-d] 5-year anniversary -- happy veterans day
Message-ID:

happy veterans day.

it's the 5-year anniversary of my joining this listserve...

one of the first things i did upon joining -- after reading back
a couple years in the archives, of course -- was to state my
opinion that project gutenberg would not get volunteer uptake
on the t.e.i. direction that was planned, because that
heavy markup route is just too complicated.

so, how did that all turn out, anyway? :+)

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Bowerbird at aol.com Tue Nov 11 15:45:16 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 11 Nov 2008 18:45:16 EST
Subject: [gutvol-d] further report on "peter pan in kensington gardens"
Message-ID:

let's do some more work on "peter pan in kensington gardens".

i found 45 differences between the two versions that were posted
-- #26998 and #26999, for those of you who're keeping track here.

of course, every single one of them _might_ have been a difference
between the two editions, so i checked each against each p-book...

the good news is that, for #26999, there were only _2_ "errors"
-- i.e., places where the p.g. e-text didn't match what was printed:

> 'I think I shall go back to mother,' he said, timidly.
> 'I think I shall go back to mother,' he said timidly.

> Peter was a just master, and paid his work-people every evening.
> Peter was a just master, and paid his workpeople every evening.

***

the bad news is that, for #26998, there were 17 such "errors":

> hair cut. When David shed his curls at the hair-dressers, I am told,
> hair cut. When David shed his curls at the hairdresser's, I am told,

> hair cut. When David shed his curls at the hair-dressers, I am told,
> hair cut. When David shed his curls at the hairdresser', I am told,

> and the chaffinche's nest, but we pretend not to know what the Dog's
> and the chaffinches nest, but we pretend not to know what the Dog's

> 'I suppose,' said Peter huskily, 'I suppose I can still fly.'
> 'I suppose,' said Peter huskily, 'I suppose I can still fly?'

> glory of Peter as he saw it growing more and more like a great thrushes
> glory of Peter as he saw it growing more and more like a great thrush's

> the Thrushes Nest. When he sails, he sits down, but he stands up to
> the Thrush's Nest. When he sails, he sits down, but he stands up to

> least, what use was she to him! You can be sorry for him for that, but
> least, what use was she to him? You can be sorry for him for that, but

> among humans also; and that is why they are often made uneasy when they
> among humans also, and that is why they are often made uneasy when they

> grateful little people, too, and at the princesses coming-of-age ball
> grateful little people, too, and at the princess's coming-of-age ball

> Maimie, don't' and pulls the sheet over his head. 'It is coming
> Maimie, don't!' and pulls the sheet over his head. 'It is coming

> and he flapped his arms vigorously just as the cab-men do before they
> and he flapped his arms vigorously just as the cabmen do before they

> 'What's this.' he cried, and first he shook the heart like a watch, and
> 'What's this?' he cried, and first he shook the heart like a watch, and

> She could n't help it. She was crazy with delight over her little
> She couldn't help it. She was crazy with delight over her little

> suggestion of the doctors, but the only thing they could think of that
> suggestion of the doctor's, but the only thing they could think of that

> 'before hot and cold are put in.' and he put in hot and cold. Then an
> 'before hot and cold are put in?' and he put in hot and cold. Then an

> Yon can't think how pleased Peter was to learn that all the people
> You can't think how pleased Peter was to learn that all the people

> Just then they heard a grating creak, followed by _creak, creak_, all
> Just then they heard a grating _creak_, followed by _creak, creak_, all

***

of course, some of these _differences_ between the p-book and the e-text
could well have been e-book _corrections_ that were made to the p-book...

***

this means _26_ differences occurred between the two different editions...

these can be of interest, in that many of them seem to represent corrections
made in the later edition (#26999) to errors in the earlier (#26998) edition.
indeed, there are quite a few cases that seem to fit this description nicely...

moreover, _all_26_ could be evaluated in this light, leading to many more
_corrections_ being made to #26998. if you're correcting errors, why not fix
all of them?

***

all in all, though, if we use the first set of results, then one of these books,
#26999, was done very well, with just 2 errors present, both of them trivial.
of course, this was the e-text where an entire page was missing, so you can
temper this praise accordingly if you wish.

the other book, #26998, had 17 errors, and thus wasn't _quite_ as good, but
since none were serious, still manages to squeak by with an acceptable rate.

however, since both of these e-texts were digitized at the same time, it was
_not_ good workflow to fail to compare them to each other, since that would
have brought _both_ of the e-texts closer to perfection than they currently are.
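The comparison step described above can be sketched with Python's standard difflib module. This is a minimal illustration, not the tool that produced the counts in this report: the function name is hypothetical, and the two sample strings are just the "Yon"/"You" pair quoted earlier, standing in for the full texts of #26998 and #26999.

```python
import difflib


def differing_words(text_a, text_b):
    """Return (words_from_a, words_from_b) pairs wherever the texts disagree.

    Both texts are split on whitespace first, so differences in line
    wrapping between the two digitizations do not show up as differences.
    """
    a, b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# One line from each e-text, as quoted in the report above.
text_26998 = "Yon can't think how pleased Peter was to learn that all the people"
text_26999 = "You can't think how pleased Peter was to learn that all the people"

diffs = differing_words(text_26998, text_26999)
print(diffs)
```

Each pair that comes back still has to be checked against the page scans of both editions, since a difference may be a real edition difference rather than an error in either e-text. A missing page would show up as a single pair with an empty string on one side.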
-bowerbird

From Bowerbird at aol.com Fri Nov 14 14:01:42 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 14 Nov 2008 17:01:42 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

there's a thread over at d.p. that asks if pg/dp is still relevant/useful,
given that google is scanning millions of books...

it's a rather good discussion, as this discussion typically goes. but...

juliet says this:
> What Google and OCA can't afford to do is correct the text.
> What we do is simply too labor intensive.
> http://www.pgdp.net/phpBB2/viewtopic.php?p=502178#502178

maybe juliet hasn't been absorbing my research, but i've demonstrated
rather clearly that o.c.r.-correction is _not_ a thing that requires "labor".

_intelligent_programmatic_processing_ is what's needed to correct o.c.r.

the quality of o.c.r. these days -- yes, even on old books -- is such that
you can get fairly good recognition. then the regular clean-up routines,
requiring nothing out of the ordinary, will improve the text considerably.

after that, you go into the advanced routines, which get it down to about
an average of one error per page. then you look at multiple digitizations,
resolving differences between them, such that you approach perfection...

and finally, incorporating corrections input by your users is all you need.

in other words, there's nothing here that google can't do. nothing at all.

indeed, there's nothing here _anyone_ should find difficult to implement.

if juliet familiarized herself with what's happening within her own project,
she'd see that people like dkretz and rfrank are already doing it right there.
the best use of people who are volunteering their services to d.p. would be
to fine-tune those routines, rather than proofing every word on every page.
ironically, this would also be far less "laborious"... work smarter, not harder.

-bowerbird

From schultzk at uni-trier.de Sat Nov 15 03:59:39 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Sat, 15 Nov 2008 12:59:39 +0100
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID:

Hi BB, All,

On 14.11.2008 at 23:01, Bowerbird at aol.com wrote:

> there's a thread over at d.p. that asks if pg/dp is still relevant/useful,
> given that google is scanning millions of books...
>
> it's a rather good discussion, as this discussion typically goes. but...
>
> juliet says this:
> > What Google and OCA can't afford to do is correct the text.
> > What we do is simply too labor intensive.
> > http://www.pgdp.net/phpBB2/viewtopic.php?p=502178#502178
>
> maybe juliet hasn't been absorbing my research, but i've demonstrated
> rather clearly that o.c.r.-correction is _not_ a thing that requires "labor".

Though I basically agree with what you say below, the natural language
processing still requires some human intervention.

What Juliet is saying is that Google is not willing to pay for the needed
human-power. But then again, you could ask Google if they are willing to
hire you.

regards
Keith.

> _intelligent_programmatic_processing_ is what's needed to correct o.c.r.
>
> the quality of o.c.r. these days -- yes, even on old books -- is such that
> you can get fairly good recognition. then the regular clean-up routines,
> requiring nothing out of the ordinary, will improve the text considerably.
>
> after that, you go into the advanced routines, which get it down to about
> an average of one error per page. then you look at multiple digitizations,
> resolving differences between them, such that you approach perfection...
>
> and finally, incorporating corrections input by your users is all you need.
>
> in other words, there's nothing here that google can't do. nothing at all.
>
> indeed, there's nothing here _anyone_ should find difficult to implement.
>
> if juliet familiarized herself with what's happening within her own project,
> she'd see that people like dkretz and rfrank are already doing it right there.
>
> the best use of people who are volunteering their services to d.p. would be
> to fine-tune those routines, rather than proofing every word on every page.
> ironically, this would also be far less "laborious"... work smarter, not harder.
>
> -bowerbird

From Bowerbird at aol.com Sat Nov 15 11:09:33 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Sat, 15 Nov 2008 14:09:33 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

keith said:
> still requires some human intervention.

i think i've demonstrated pretty clearly that
when "humans" are doing the proofreading,
they still do miss the occasional error or two.
and i've also demonstrated pretty clearly that
sophisticated routines (and sometimes even
some rather unsophisticated ones) bring the
bug-rate for a book down to an error or two.

so i don't think you can justify this conclusion.

and machines keep getting better and better,
while the humans are pretty much stuck as is.

of course, there are arenas (such as equations)
where the o.c.r. essentially cannot be repaired.
but it's not as if unknowledgeable humans can
substantially improve those arenas much either.

finally, in case you hadn't noticed, i also _did_
include "human intervention" in my work-flow,
in the form of user-submitted error-reports...

> Juliet is saying Google is not willing
> to pay for the needed human-power.

no, she's saying that it is a labor-intensive task.
and i say it's not. and i have shown my research.

the nature of my research is relatively obvious...
(at least, the way i've gone about o.c.r.-cleaning
is obvious, even if the _extent_ to which it works
has proven to be rather surprising, even to me.)

i believe it would be silly to think that google --
with its massive wealth and state-of-the-art lab
-- won't come up with even better methodology.
they have a tremendous corpus to work against.
(but if anyone wants to bet against them, i'll bet!)

besides, who says google will be unwilling to pay
for "needed human-power" when the time comes?
they are paying for the people to do the scanning,
which costs much more, and the machines as well.
so why not pay for people to do the final proofing?
after all, it's not as if google is strapped for cash...

so, asking whether google can correct the o.c.r. is
the wrong question for us to be asking at this time.
of course they can. crap, even _we_ can correct it...

the question we should be asking is whether google
intends to share the corrected o.c.r. with the public...

-bowerbird

From schultzk at uni-trier.de Sun Nov 16 12:27:47 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Sun, 16 Nov 2008 21:27:47 +0100
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID:

Hi BB, All,

As you have noted a human is needed! The corrections are not done
automatically.

What we are talking about is automatic text recognition. There is NO
system to date that will do this fully automated. I should know, it is
my field.

Do not get me wrong. The tools help and make things more efficient.

We have been promised machine translation for the past twenty years.
It still is not here, yet!

Why would Google pay to have their texts proofed and then not share it
in some form? What a waste of resources. Do not you think that Google
would have done this if they thought it viable?

The text still has to be proofed finally by a human. That is the labor
involved. Call it what you like, it still has to be done.

regards
Keith.

From Bowerbird at aol.com Sun Nov 16 13:32:30 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Sun, 16 Nov 2008 16:32:30 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

keith said:
> As you have noted a human is needed!

ok, keith, pay attention, please, ok?

the reason juliet says this task is labor-intensive
is because her methodology calls for the checking
of every single o.c.r. word against the page-scan.

i say that's not necessary. and i've proven it...

most o.c.r. text can be cleaned up _extensively_
(but not completely) via programmatic routines
that do _not_ need to be attended by a human...

furthermore, the more books you correct this way,
the better your programmatic routines will become.
and google has lotsa books to improve itself with.

and after the automatic cleaning, you can turn to
_comparison_ methodology. differences between
two scan-sets of the same book can be resolved
automatically often -- even more often than not.

i haven't done any work on comparisons among
_more_ than two scan-sets, but i would imagine
that it will improve the accuracy-rate even more.

and remember that, once they're finished, google
will have more than one scan-set on many books.
so they'll have this option available to them often.

(there are a lot of books out there that are held in
only one library, it's true. but those books aren't
the ones that people will generally be looking for.
thus, the accuracy of their o.c.r. is less important.)

now, all of this is before _any_ human is involved.

for many books, you can get the number of errors
down to a handful, before ever involving a human.

> I should know it is my field.

well, excuse me for stepping on your expertise...
but i've shown here -- with solid research -- that accuracy can be attained with little human input... > We have been promised machine translation > for the past twenty years. It still is not here, yet! you've got your threads mixed up. > Why would Google pay to have their texts proofed > and then not share it in some form. have you heard about the "settlement" of their suit? google is now trying to become a monopoly player... and a monopoly player does not share its resources. > Do not you think that Google would have > done this if they thought it viable. i'm not sure what you're talking about here... if you're asking why google isn't correcting their o.c.r. _right_now_, i would respond that they're doing work in their labs to accomplish this objective, and they're probably a lot further along than they are letting on... my guess is they can already get nearly-perfect text, after applying their clean-up routines to their o.c.r., with <2 minutes of human input on an average book. (i can do it with 20 minutes of work, and i'm guessing their system works 10 times better than mine, ergo...) > The text still has to be proofed finally by a human. you're wrong. a very high degree of accuracy can be obtained even without any human involvement at all. and humans make proofing errors too. which means that -- if you've already _got_ text with "a very high degree of accuracy", there'll be no guarantee that humans will improve it much. so, at some point, it's simply not cost-efficient to _pay_ humans to do the job. even if they _could_ find that last o.c.r. error. (and they often cannot.) besides, as i argued earlier, if google decides it _is_ worthwhile to them to _pay_ humans to do proofing, google certainly has enough money they can do that. but they could also go another route, and recruit the public-at-large to assist them in doing the proofing. that's what they're doing now, by soliciting reports on bugs in their work. google gets a lot of eyeballs. 
and we all know the cumulative power of volunteers.

google could also use other tactics, like "recaptcha",
to leverage input from humans without paying for it.

so even if total accuracy _does_ require human input,
there's no reason, at all, that google cannot attain it...

> That is the labor involved.

some errors can only be corrected by a human. some.

is that the _concession_ that you're looking for, keith?

i hope so, because this conversation is going nowhere.

> Call it what you like, it still has to be done.

no, it doesn't "have to be done".

in fact, most people -- like juliet -- seem to believe
that google is just going to settle for their raw o.c.r.

i think that's silly.

but they _could_. since no other search engine has
the same corpus, they don't have any competition...
so nothing could _force_ them to correct their o.c.r.

thus, they _could_ settle for raw o.c.r. but they won't...

-bowerbird

From schultzk at uni-trier.de Mon Nov 17 02:18:30 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Mon, 17 Nov 2008 11:18:30 +0100
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID: <0F6575E3-9CBD-4785-8DA6-9E7C46E43656@uni-trier.de>

Hi BB, All,

Evidently, you are not paying attention.

1) With different scans WHO decides which one is correct? Yes, with
   more than two scans you could use a majority takes all decision.
   But who says the majority is not flawed?

2) In everything you have mentioned here, the errors are shown and
   found. But does your routine actually change the error without
   consulting the person running the routines?

3) Like I said, there will be at least one time where the "comparison"
   method must be used! That requires at least one human doing it.
   As you have mentioned often enough, more are better.
   THAT IS THE LABOR INVOLVED.

4) Computer linguists know all about your methods, and even more.
   If things were that simple and efficient, commercial products would
   be using them already. The methods have been around for some
   20 years. I had mentioned machine translation before. The methods
   work well for a small subset of the language, but do not for a
   language as a whole. Language is basically type 1 and type 2.

5) It takes you just ten minutes to read a WHOLE BOOK. Lucky you.

6) The methods you have mentioned are worth millions to the publishers,
   too, if they were that great! Get back to reality.

From Bowerbird at aol.com Mon Nov 17 09:03:47 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 17 Nov 2008 12:03:47 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

keith said:
> Evidently, you are not paying attention.

well, keith, at least you're a scrapper, not a whiner... :+)

but hey, i've summoned up all kinds of data to support
my arguments, real-life data from actual digitizations,
the vast majority from books which i did not pre-select,
while you come here only with your opinions. perhaps
you have this list confused with the list of years back?
because the ground-rules have changed; you need data.

> With different scans WHO decides which one is correct.

the programmatic rules -- derived from experimentation
that tests which versions of those rules give best results --
are _what_ decides which of the choices we'd choose.
(which, of course, doesn't mean it is the _correct_ choice.)

but, in most cases, it's fairly obvious which one to choose.

for instance, if a line differs between the two digitizations,
with one consisting of words which are in the dictionary,
and the other containing one or more words that are not,
we'd go with the former. that's the most common case,
which is why i mention it first, but there are several other
cases which are equally clear when it comes to resolution.

but again, your concern about which one is "correct" is
leading up a blind alley. the program chooses the option
most likely to be correct, given its understanding of the
choice in front of it, while _knowing_ it might be wrong.
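The dictionary rule described just above (when a line differs between two digitizations, prefer the version whose words are all in the dictionary) might look roughly like the following. This is a hedged sketch, not bowerbird's actual routine: the names `unknown_words` and `resolve`, the word-matching pattern, and the toy dictionary are all assumptions made for the illustration, and a real run would load a full word list.

```python
import re

# Assumed tokenizer: runs of letters, allowing internal apostrophes ("can't").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def unknown_words(line, dictionary):
    """Return the words in `line` that are not in `dictionary` (lowercased)."""
    return [w for w in WORD_RE.findall(line) if w.lower() not in dictionary]


def resolve(line_a, line_b, dictionary):
    """Pick between two versions of a line using the dictionary rule.

    Returns the line whose words are all known when the other line has
    at least one unknown word. Returns None when the rule cannot decide
    (both clean or both suspect), so another rule or a human can take over.
    """
    if line_a == line_b:
        return line_a
    bad_a = unknown_words(line_a, dictionary)
    bad_b = unknown_words(line_b, dictionary)
    if not bad_a and bad_b:
        return line_a
    if not bad_b and bad_a:
        return line_b
    return None


# Toy dictionary for the example only (assumed; far too small for real use).
toy_dictionary = {"it", "was", "very", "romantic", "mysterious", "she", "told"}

scan_one = "It vas very romantic, very mysterious, she told"
scan_two = "It was very romantic, very mysterious, she told"
print(resolve(scan_one, scan_two, toy_dictionary))
```

Returning None rather than guessing reflects the point made in the message: the rule settles the common case, and anything it cannot settle is passed along. In particular, a pair where both readings are legitimate dictionary words is left undecided by this rule.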
the fact that we actually end up with the "correct" answer
the vast majority of the time is what makes this _efficient_,
and that efficiency is the main reason we pursue this path...

you need to get over the fear of making the wrong choice.
it will happen sometimes. we do not expect _perfection_.
as long as we save a lot of time, we're happy and satisfied.

> with more than two scans you could use a majority
> takes all decision. But who says the majority is not flawed.

a "majority takes all" rule is rather blunt-force. you _might_
want to resort to that, but only after exhausting other rules.
nonetheless -- as with _all_ of the rules we would follow --
there is some probability that a decision would be "wrong".
your majority can be flawed -- it almost certainly _will_ be
on some occasions -- but we don't let that fact cripple us...

as long as any rule leads to a _correct_decision_ more often
than not -- preferably _far_ more often than not -- then we
have a rule that we can use. further, the better its accuracy,
the more we use it. when it chooses the wrong alternative,
that just means we have an error we have to fix downstream.
it's not the end of the world.

> Everything you have mentioned here the errors are shown
> and found. But does your routine actually change the error
> without consulting the person running the routines.

oh please. how many times do we have to answer that question?

there are some errors that can be fixed without any monitoring.
others need to be checked. and it runs the gamut in-between...
when you get down to it, the cases are pretty easy to determine.

other than that, i refuse to get on this merry-go-round yet again.

> Like I said there will be at least one time where
> the "comparison" method must be used!

where is your data?

> That requires at least one human doing it.

much of the comparison process can be done by the machine.

> As You have mentioned often enough more are better.
> THAT IS THE LABOR INVOLVED.
even in less-frequent situations where a human _is_ necessary,
the time required is very minimal.

you seem to be under the mistaken impression that i have said
absolutely no human labor is needed. that's a ridiculous position
-- you should feel embarrassed for trying to ascribe it to me --
but my position is clear on this.

i was taking issue with _juliet_, who seems to think that every
single word needs to be compared to the scan, and not just once
but _multiple_ times, and who thus argues that this is a
labor-intensive task. it _is_ labor-intensive if you buy her bloated
workflow, but i say that that's unnecessary, and thus that the
amount of labor required is _considerably_less_ than what she
believes it to be. and i've got data to back me up...

before you write another response, read that last paragraph again.
if you want to argue with me, that's the position you need to refute.
and remember, you need to do it with _data_, not with _assertions_.

> Computer linguists know all about your methods, and even more.

well, it's a good thing we don't depend on them to correct our o.c.r.

> It takes you just ten minutes to read a WHOLE BOOK.

my point is that we don't have to _read_ the book to correct the o.c.r.

> The methods you have mentioned are worth millions to the
> publishers, too, if they were that great! Get back to reality.

data _is_ reality. i've shown mine. where is yours?

-bowerbird

From Bowerbird at aol.com Mon Nov 17 09:18:48 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 17 Nov 2008 12:18:48 EST
Subject: [gutvol-d] speaking of data -- mountain blood -- once again
Message-ID:

remember back during the month of july, when i shared
a clean-up rule every day, showing how "mountain blood",
a book that was then being proofed over at d.p., could have
been cleaned up very quickly -- in minutes -- to a
nearly-perfect state?

well, you might now be happy to learn that that project was
recently "skipped" over f2, and thus has moved on to
post-processing.

thus is the slow and laborious process at d.p.

-bowerbird

From ebooks at ibiblio.org Mon Nov 17 12:38:56 2008
From: ebooks at ibiblio.org (Jose Menendez)
Date: Mon, 17 Nov 2008 15:38:56 -0500
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID: <4921D660.5090000@ibiblio.org>

On Nov. 15, 2008 Bowerbird wrote:

> i think i've demonstrated pretty clearly that
> when "humans" are doing the proofreading,
> they still do miss the occasional error or two.
>
> and i've also demonstrated pretty clearly that
> sophisticated routines (and sometimes even
> some rather unsophisticated ones) bring the
> bug-rate for a book down to an error or two.

[snip]

> the nature of my research is relatively obvious...
> (at least, the way i've gone about o.c.r.-cleaning
> is obvious, even if the _extent_ to which it works
> has proven to be rather surprising, even to me.)

Unfortunately, Bowerbird, your boasting is exceeded by your bungling.
I would have posted the following reply earlier, but I was hoping that someone else on this list would check out your claims and save me the trouble of typing it. You know how much I hate typing. :) _________________________________________________________ In this post back on Oct. 6th, [gutvol-d] "jean of the lazy a" -- 006 http://lists.pglaf.org/private.cgi/gutvol-d/2008-October/009241.html you wrote: > we're resolving the p.g./archive.org differences in "jean of the lazy a". > > http://www.gutenberg.org/files/538/538.txt > > http://www.archive.org/details/jeanoflazy00boweiala > > *** > > ok, i've made my last pass at this book -- and found two more text errors: > > > some old lady in the house gabbling and gossiping. I'm not the least > > some old lady in the house gabbing and gossiping. I'm not the least > > http://z-m-l.com/go/jeana/jeanap206.html > > > "What you going to call it a The Perils of the Prairie, say?" Burns > > "What you going to call it? The Perils of the Prairie, say?" Burns > > http://z-m-l.com/go/jeana/jeanap226.html The first "text error" you listed isn't an error. The scan of page 206 clearly shows "gabbling." And if you consult a good dictionary, you'll see that "gabble" is a legitimate word. For instance, see this Merriam-Webster entry: http://www.merriam-webster.com/dictionary/gabble > my new version is greatly improved over my previous versions: > > http://z-m-l.com/go/jeana/jeanap123.html > > http://z-m-l.com/go/jeana/jeana.zml > > this version might still have some errors -- even a whole lot of 'em -- > involving missing single-quotes. the o.c.r. from archive.org missed > some, i know, and might have missed lots more. (i don't know if this > was because the o.c.r. never saw them, or if a glitch in their workflow > _lost_ them, but either way, the end-result ends up being the same.) 
> > i worked hard enough restoring the end-line-hyphens they had lost, > so i decided i wouldn't even bother to check the single-quote-marks, > since i can't think of a way to write a routine to automate that check... > but other than that, this text is _perfect_, as far as i am concerned... > so jose, have at it... you can blame me for any errors that you locate. So is your text "perfect" other than errors "involving missing single-quotes"? Not even close. For example: out on the fat of its side in the sun, sound asleep. The out on the flat of its side in the sun, sound asleep. The http://z-m-l.com/go/jeana/jeanap003.html the place empty of her cheerful presence. Be looked the place empty of her cheerful presence. He looked http://z-m-l.com/go/jeana/jeanap004.html him. He'll do it, too, take it from me, Crofty is shore him. He'll do it, too, take it from me; Crofty is shore http://z-m-l.com/go/jeana/jeanap004.html shed where the youngest calf slept beside its mother, shed where the youngest calf slept beside its mother. http://z-m-l.com/go/jeana/jeanap026.html felt heavy and stupid; and the last cigarette he lighted; felt heavy and stupid; and the last cigarette he lighted http://z-m-l.com/go/jeana/jeanap027.html and gone off to water Jean's dowers. He was positive and gone off to water Jean's flowers. He was positive http://z-m-l.com/go/jeana/jeanap027.html "We're taking the long way round," he observed "We're taking the long way round," he observed, http://z-m-l.com/go/jeana/jeanap036.html "Well," she said at length, "turn your backs, you've "Well," she said at length, "turn your backs; you've http://z-m-l.com/go/jeana/jeanap056.html hollow in mind. If they could pull through there with hollow in mind. 
If they could pull through there with- http://z-m-l.com/go/jeana/jeanap075.html "I always supposed that fat men were essentially; "I always supposed that fat men were essentially http://z-m-l.com/go/jeana/jeanap095.html der, but it dropped instead to his coat pocket and fum der, but it dropped instead to his coat pocket and fum- http://z-m-l.com/go/jeana/jeanap098.html would probably have found them extremely common would probably have found them extremely common- http://z-m-l.com/go/jeana/jeanap106.html facts, and all the nagging-" facts, and all the nagging--" http://z-m-l.com/go/jeana/jeanap108.html self, he probably did not suspect that there was any- self, he probably did not suspect that there was any http://z-m-l.com/go/jeana/jeanap122.html (Now, Bowerbird, you may try to claim that your hyphen after "any" is correct because the first word on the next line is "one." In this old book, however, "anyone" isn't used a single time. But "any one" occurs in 14 lines, and you didn't change any of them into "anyone.") It vas very romantic, very mysterious, she told A. It was very romantic, very mysterious, she told http://z-m-l.com/go/jeana/jeanap125.html (Here you not only failed to correct the "vas," but you messed up the line break. You got the line breaks wrong a number of times. Shocking, considering how much you talk about the importance of line breaks.) edly. "I'm not accustomed to working under two dir- edly. "I'm not accustomed to working under two di- http://z-m-l.com/go/jeana/jeanap136.html Lee Milligan was the drowning man!, and the agony of Lee Milligan was the drowning man! and the agony of http://z-m-l.com/go/jeana/jeanap146.html She was up in the saddle and gone in a flurry of dusts She was up in the saddle and gone in a flurry of dust, http://z-m-l.com/go/jeana/jeanap150.html "Oh, you'll do," chuckled Robert Grant Burns, "Oh, you'll do," chuckled Robert Grant Burns. 
http://z-m-l.com/go/jeana/jeanap159.html and being careful to give no hint of that belief to any- and being careful to give no hint of that belief to any http://z-m-l.com/go/jeana/jeanap171.html ward and listen, and look, -- how far can she turn, Pete; ward and listen, and look, -- how far can she turn, Pete, http://z-m-l.com/go/jeana/jeanap181.html palms, and her elbows on her knees. Vague shadows; palms, and her elbows on her knees. Vague shadows http://z-m-l.com/go/jeana/jeanap223.html ten per cent, they ought to pay me quite a lot more than ten per cent. they ought to pay me quite a lot more than http://z-m-l.com/go/jeana/jeanap228.html I'm taking you home with me in obedience to my wife's, I'm taking you home with me in obedience to my wife's http://z-m-l.com/go/jeana/jeanap241.html almost entrancing, soft drawl to her voice and a most a most entrancing, soft drawl to her voice and a most http://z-m-l.com/go/jeana/jeanap270.html called to him on the range, in Montana "Hello, called to him on the range, in Montana. "Hello, http://z-m-l.com/go/jeana/jeanap277.html stay and face-things. I -- I have felt as if I could stay and face -- things. I -- I have felt as if I could http://z-m-l.com/go/jeana/jeanap284.html the day before?" she asked abruptly. "He wasn't -- the day before?" she asked abruptly. "He wasn't http://z-m-l.com/go/jeana/jeanap301.html ground his bitter mouth; pale with the tragic prison around his bitter mouth; pale with the tragic prison http://z-m-l.com/go/jeana/jeanap320.html Well, Bowerbird, there's a partial list of the errors in your "perfect" text--none of them "involving missing single-quotes." 
Now here's an error involving a single quotation mark that wasn't missing: 'Wasn't anything to tell -- till there was something "Wasn't anything to tell -- till there was something http://z-m-l.com/go/jeana/jeanap315.html Jose Menendez From Bowerbird at aol.com Mon Nov 17 16:33:23 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Mon, 17 Nov 2008 19:33:23 EST Subject: [gutvol-d] already doing it right there Message-ID: this is why i invite jose to check my work. he's got sharp eyes, and he clearly takes great delight in finding mistakes i've made. nothing like a motivated enemy to keep you straight... :+) and jose, i'm gonna give you a "signal" judgment on this post, but not without noting that you've been grossly dishonest here. and i'll explain why... i'm typically extremely clear on the _criterion_ that i am using, in accordance with the nature of the research that i am doing... on this particular book, i was comparing the reposted p.g. e-text with the uncorrected o.c.r. text obtained from the internet archive. in other words, i was comparing two flawed texts to each other, to see if a comparison would elucidate problems between them. the comparison did just that, enabling me to locate several bugs in the p.g. e-text. i also found problems with the o.c.a. version... that was the nature of the test. i described my text as "perfect" using the _criterion_ of the text resulting from the merge of the two different texts, both of which, it so happens, were flawed... thus, the merged text was also flawed. big deal. i did the comparison, and that's all. i didn't do any other checks on the text after merging, or i would've found some of the errors. the point of that series was to make the point that comparison is a methodology that al should have used to get better results. nothing in your post, jose, provides counterargument to _that_. 
*** one weakness -- which i have always acknowledged explicitly -- of the comparison methodology is a "blindness" when it comes to any situation where there is an identical error in both of the texts. and i'm assuming that that's what happened here... i might have made some errors doing the comparison. it happens. but my guess is that the vast majority of the errors jose pointed out in "my" version of the text are ones that exist in the p.g. e-text too, which is precisely why the comparison didn't isolate those errors... in fact, i'd be willing to bet money on it. i'll give you ten bucks for every one of your errors that is _not_ in the p.g. e-text, jose, if you will give me ten bucks for every one that _is_ in the p.g. e-text... so, do we have a bet? or not? no, i didn't think so. not so haughty once the _whole_ truth is out there, are you? *** dishonesty is never a good policy, jose... so i called you on yours... it's especially bad because i suspect you knew what you were doing, and you chose to do it anyway. that's not a credit to your character. on the other hand, i'll also give you credit for proofing this e-text. you caught a number of errors there, assuming that they are right -- and we all know that i value your proofing abilities very highly -- and that's quite good. i certainly will take advantage of it for my text. i suggest that the p.g. whitewashers take advantage for their text too. (indeed, al, i'd think you'd want to take a very close look at this report, since i would assume you _did_ run that text through all your checks.) so jose, that's why i'll give you credit for "signal" for this post, not "noise". but the next time you come in with a smear job, i will dock you for it, even if there are offsetting factors that might otherwise mitigate it... -bowerbird p.s. jose's ratio: 4 signal, 3 noise.
From schultzk at uni-trier.de Tue Nov 18 06:39:49 2008 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Tue, 18 Nov 2008 15:39:49 +0100 Subject: [gutvol-d] already doing it right there In-Reply-To: References: Message-ID: <89AEACE9-FAF4-41A8-A9BB-0C0101E23BE8@uni-trier.de> Hi BB, All, I think I rest my case here. You admit that you need a njon-flawed text inorder to get your results. I assume the natural case of all text not been proffed yet, in any form. Also, as Jose stated there are factors which have side affects to the accuracy of the proofing such the size and quality of the dictionary used. I do admit I was impressed by your reply to me that you use heuristics an self programming technics to improve your program. regards Keith.
From Bowerbird at aol.com Tue Nov 18 10:25:52 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 18 Nov 2008 13:25:52 EST Subject: [gutvol-d] already doing it right there Message-ID: keith said: > I think I rest my case here. at the point where you need data, you rest your case? that's interesting. > You admit that you need a njon-flawed text > inorder to get your results. um, no. emphatically not. a bad misinterpretation. what i said is that _the_comparison_method_ has a blindness to the situation whenever the error in one of the texts is the same as in the other text... as i also said, i have _always_ acknowledged that. however, i have done empirical testing finding that two o.c.r. sets typically have few errors in common. and even some of _those_ are detected quite easily (i.e., where the mutual error won't pass spellcheck, or violates one of the other standard checks done).
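The trade-off described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling anyone on the list actually used: the helper names `diff_words` and `not_in_wordlist`, the two sample lines, and the tiny word list are all hypothetical, and the standard-library `difflib` stands in for whatever comparison routine was really run. It shows a comparison that flags only the places where two versions disagree, plus a word-list check that may catch an error the two versions share.

```python
import difflib

def diff_words(text_a, text_b):
    """Return (a_words, b_words) pairs wherever the two versions disagree."""
    a, b = text_a.split(), text_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

def not_in_wordlist(text, wordlist):
    """Return words (lowercased, outer punctuation stripped) missing from wordlist."""
    cleaned = (w.strip(".,;:!?\"'").lower() for w in text.split())
    return [w for w in cleaned if w and w not in wordlist]

# Two hypothetical transcriptions of the same line. Both share the error
# "vas"; only version_a has "dowers" where the page reads "flowers".
version_a = "It vas very romantic, and he watered Jean's dowers."
version_b = "It vas very romantic, and he watered Jean's flowers."

# Hypothetical word list; "dowers" is included because it is a real word.
WORDLIST = {"it", "was", "very", "romantic", "and", "he",
            "watered", "jean's", "flowers", "dowers"}

differences = diff_words(version_a, version_b)   # where the versions disagree
suspects = not_in_wordlist(version_a, WORDLIST)  # words failing the word list

print(differences)
print(suspects)
```

With these inputs the comparison reports only the dowers/flowers disagreement, because both versions agree on "vas"; the word-list check reports "vas" but lets "dowers" through, because it is a legitimate word. An error that is shared by both texts and is also a real word would pass both checks.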
so it boils down to a vulnerability to stealth scannos. remember that, in the book jose looked at, we did _not_ have two o.c.r. sets. we had one o.c.r. set which was being compared with the p.g. e-text, and we have little information on its provenance. furthermore, i can't remember how i did the merge that created the end-product text jose was using. my best guess is that i used the p.g. e-text as the "base", and just made the corrections that i found as a result of doing the comparison. so i certainly don't want to extrapolate from that methodology... i didn't run any other checks on the resultant text. which is what the protocol calls for. indeed, as i pointed out in a recent post, the standard order is to do the cleaning _first_, and then a comparison. but i didn't do any pre-cleaning of the p.g. e-text. to tell you the truth, i presumed i didn't need to, because i assumed that al had run all his checks; either he failed to do that, or his tool has faults. and gee, after i'd found a dozen errors, i felt like i should lay off. al feels like i'm picking on him as is. so i'm glad that jose did the dirty work to dig up those ~30 _more_ errors in the p.g. e-text. jose's dishonesty had the post targeted at me, but it's actually more indictment of the repost process. (and, with this new data, the original 1998 e-text doesn't look nearly as good as i had described it, although -- at 322 pages -- it's still not too bad.) > Also, as Jose stated there are factors which have > side affects to the accuracy of the proofing > such the size and quality of the dictionary used. i must've missed a post, as i saw nothing like this. > I do admit I was impressed by your reply to me > that you use heuristics an self programming > technics to improve your program. i'm not sure i understand what that means either. *** so, i was curious about the large number of errors. plus i wondered if i'd made a colossal blunder with my "bet", and would lose my shirt at $10 per error. 
so i went through jose's list, one by one, to check. nope. i'm right. almost all were in the p.g. e-text. however, what was a bit surprising to me was that almost none of them were in the archive.org text... i'd assumed that these were errors-in-common, and that's why the comparison didn't reveal them. but i was wrong. the reason these errors persisted into the "merge" was because i made errors in the merging process. some of my mistakes were because i was being conservative in deciding that al made an error. (because, like i said, he's already oversensitive, so i need to make sure i'm right if i say "error".) but mostly, i just flubbed up. the text was correct in the archive.org version, incorrect in the p.g. one, and i just plain chose the wrong option. stupid me. what can i say? only this: to err is human. i'm human. but take heed, as there's a lesson to be learned here... the problem was not with the comparison methodology. the problem was with the human flipping the switches... so perhaps now, keith, you will let me rest _my_ case! -bowerbird
From schultzk at uni-trier.de Tue Nov 18 12:04:47 2008 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Tue, 18 Nov 2008 21:04:47 +0100 Subject: [gutvol-d] already doing it right there In-Reply-To: References: Message-ID: Hi BB, All, On 18.11.2008 at 19:25, Bowerbird at aol.com wrote: > keith said: > > I do admit I was impressed by your reply to me > > that you use heuristics an self programming > > technics to improve your program. > > i'm not sure i understand what that means either. After heuristics that should be a "and" and not "an" BB had written at 17.
November 2008 18:03:47 CET: > the programmatic rules -- derived from experimentation > that tests which versions of those rules give best results > -- are _what_ decides which of the choices we'd choose. > and > > as long as any rule leads to a _correct_decision_ more often > than not -- preferably _far_ more often than not -- then we > have a rule that we can use. further, the better its accuracy, > the more we use it. when it chooses the wrong alternative, > that just means we have an error we have to fix downstream. > it's not the end of the world. This way of doing things is heuristic. BB had written at 16. November 2008 22:32:30 CET: > most o.c.r. text can be cleaned up _extensively_ > (but not completely) via programmatic routines > that do _not_ need to be attended by a human... > > furthermore, the more books you correct this way, > the better your programmatic routines will become. > and google has lotsa books to improve itself with. O.K. I misinterpreted your statement here. So, I will take the self-programming back. Anyway, I do feel yours methods are good and make helpful tools and do feel that they, as you say, make for a more efficient workflow. regards Keith.
From Bowerbird at aol.com Tue Nov 18 12:52:52 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 18 Nov 2008 15:52:52 EST Subject: [gutvol-d] already doing it right there Message-ID: keith said: > This way of doing things is heuristic. i'd just call it "common sense"... ;+) > Anyway, I do feel yours methods are good > and make helpful tools and do feel that they, > as you say, make for a more efficient workflow. crap, keith, don't start _agreeing_ with me! that doesn't help me improve my thinking! disagree in a way that makes us both smarter! -bowerbird
From schultzk at uni-trier.de Tue Nov 18 14:02:39 2008 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Tue, 18 Nov 2008 23:02:39 +0100 Subject: [gutvol-d] already doing it right there In-Reply-To: References: Message-ID: Hey, I never disagreed about the value of your methods. O.K. Here is one way to improve your methods or expand them: Automatically make a list of ALL words in the text and their positions. Then check whether any of them are similar; if so, check the variants against a dictionary to see if they might be misspelled. If one of the variants shows up as being misspelled, correct it. If none of the variants are in the dictionary, have them checked by a human. If they check out, then add them to the dictionary. Another way to improve this method and avoid unnecessary checks of words that are not in the dictionary is to use Markov chain analysis on the spelling. You could use a concordance to do different kinds of analysis or help with the above. This will help with the idiosyncrasies of an author. Similarly, Markov chains at the word level can be used. In addition, a morphological component could be used. All of the above can be used in deciding what is an error and what is not. It is basically the intelligence you are looking for.
From ebooks at ibiblio.org Wed Nov 19 12:45:52 2008 From: ebooks at ibiblio.org (Jose Menendez) Date: Wed, 19 Nov 2008 15:45:52 -0500 Subject: [gutvol-d] already doing it right there In-Reply-To: References: Message-ID: <49247B00.7040809@ibiblio.org> This reply from Bowerbird is a perfect example of why I don't reply to so many of his posts. Rather than honestly admitting his bungling, he resorted to dishonesty, evasions, lame excuses, etc. Tsk! Tsk! On Nov. 17, 2008 Bowerbird wrote: > this is why i invite jose to check my work. he's got sharp eyes, > and he clearly takes great delight in finding mistakes i've made. I find it amusing how you try to pass yourself off as an expert on making ebooks, then show yourself to be an inept bungler. :) > nothing like a motivated enemy to keep you straight... :+) So how long have you been suffering from paranoia? ;) > and jose, i'm gonna give you a "signal" judgment on this post, > but not without noting that you've been grossly dishonest here. I haven't been dishonest here, grossly or otherwise. > and i'll explain why... Here come the evasions and lame excuses. > i'm typically extremely clear on the _criterion_ that i am using, > in accordance with the nature of the research that i am doing... > > on this particular book, i was comparing the reposted p.g. e-text > with the uncorrected o.c.r. text obtained from the internet archive.
Could you point out the post in your "jean of the lazy a" series in which you were "extremely clear" that you were using "the uncorrected o.c.r. text" from the IA? > in other words, i was comparing two flawed texts to each other, > to see if a comparison would elucidate problems between them. > > the comparison did just that, enabling me to locate several bugs > in the p.g. e-text. i also found problems with the o.c.a. version... > > that was the nature of the test. i described my text as "perfect" > using the _criterion_ of the text resulting from the merge of the > two different texts, both of which, it so happens, were flawed... > > thus, the merged text was also flawed. big deal. Nonsense. I'll quote again the part of your post where you called your text perfect: > this version might still have some errors -- even a whole lot of 'em -- > involving missing single-quotes. the o.c.r. from archive.org missed > some, i know, and might have missed lots more. (i don't know if this > was because the o.c.r. never saw them, or if a glitch in their workflow > _lost_ them, but either way, the end-result ends up being the same.) > > i worked hard enough restoring the end-line-hyphens they had lost, > so i decided i wouldn't even bother to check the single-quote-marks, > since i can't think of a way to write a routine to automate that check... > but other than that, this text is _perfect_, as far as i am concerned... > so jose, have at it... you can blame me for any errors that you locate. The only remaining errors you allowed for were "involving missing single-quotes." "but other than that," you said, "this text is _perfect_, as far as i am concerned..." You didn't mention any possibility of other kinds of errors. Then you dared me to check your text and said I could blame you for any errors I found. > i did the comparison, and that's all. i didn't do any other checks > on the text after merging, or i would've found some of the errors. 
If you didn't do additional checks on the text before issuing that challenge to me, it doesn't say much for your common sense. :) > the point of that series was to make the point that comparison > is a methodology that al should have used to get better results. > nothing in your post, jose, provides counterargument to _that_. Where did I say I was trying to provide a counterargument against comparing texts? I clearly stated the point of my post right at the start: "Unfortunately, Bowerbird, your boasting is exceeded by your bungling." Neither comparison nor any other methodology will give very good results if you bungle the job, Bowerbird. And you *did* bungle it. > one weakness -- which i have always acknowledged explicitly -- > of the comparison methodology is a "blindness" when it comes to > any situation where there is an identical error in both of the texts. > > and i'm assuming that that's what happened here... Your assumption is wrong, as you admitted in a later reply to Keith Schultz. > i might have made some errors doing the comparison. it happens. You make errors?! With your expertise on efficient workflows, on marching texts to perfection? With those super tools you've programmed? Say it ain't so! ;) > but my guess is that the vast majority of the errors jose pointed out > in "my" version of the text are ones that exist in the p.g. e-text too, > which is precisely why the comparison didn't isolate those errors... No. The errors would have to have been in both the PG e-text *and* the IA's OCR file for the comparison not to isolate them. But most of them weren't in the IA's file as you admitted in that later reply to Keith Schultz. > in fact, i'd be willing to bet money on it. i'll give you ten bucks for > every one of your errors that is _not_ in the p.g. e-text, jose, if you > will give me ten bucks for every one that _is_ in the p.g. e-text... > > so, do we have a bet? or not? no, i didn't think so. 
You should have offered to give me ten bucks for every error that isn't in the IA's OCR file. You would have come out the worse because some of those errors weren't in either the PG or IA files. *You* introduced them into your file. For example, both the PG and IA files had the correct word "gabbling." You changed it to "gabbing" in your file. You mistakenly added the hyphens after "any" at the ends of these two lines: self, he probably did not suspect that there was any- and being careful to give no hint of that belief to any- Neither the PG or IA texts have the comma after the exclamation mark in this line: Lee Milligan was the drowning man!, and the agony of How you managed to add that comma is beyond me. But then you're the expert (self-proclaimed) on making ebooks. > not so haughty once the _whole_ truth is out there, are you? The only way "the _whole_ truth" would come out from you is unintentionally. > dishonesty is never a good policy, jose... so i called you on yours... > it's especially bad because i suspect you knew what you were doing, > and you chose to do it anyway. that's not a credit to your character. If "dishonesty is never a good policy," you shouldn't resort to it so often, Bowerbird. As for character, your response to my post reinforces the low opinion I have of your character. > on the other hand, i'll also give you credit for proofing this e-text. > you caught a number of errors there, assuming that they are right > -- and we all know that i value your proofing abilities very highly -- > and that's quite good. i certainly will take advantage of it for my text. > i suggest that the p.g. whitewashers take advantage for their text too. > (indeed, al, i'd think you'd want to take a very close look at this report, > since i would assume you _did_ run that text through all your checks.) I didn't proofread a single page of that e-text. 
I downloaded the IA's DJVU file of the book, extracted the page scans, then used my usual image enhancement/OCR techniques to get extremely accurate OCR. Then I simply compared it to your file. You see, Bowerbird, I know how to do a file comparison without bungling the job. :) > so jose, that's why i'll give you credit for "signal" for this post, not > "noise". > > but the next time you come in with a smear job, i will dock you for it, > even if there are offsetting factors that might otherwise mitigate it... In your "jean of the lazy a" post, you said: "so jose, have at it... you can blame me for any errors that you locate." Now, when I take you up on your challenge, provide a lengthy list of errors left in your file, and ascribe them to the correct cause, i.e. your bungling, you whine that it's "a smear job." You truly are pathetic. Jose Menendez From Bowerbird at aol.com Wed Nov 19 15:50:05 2008 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 19 Nov 2008 18:50:05 EST Subject: [gutvol-d] already doing it right there Message-ID: i sure hope people are being entertained by all this... *** jose said: > This reply from Bowerbird is a perfect example > of why I don't reply to so many of his posts. > Rather than honestly admitting his bungling, he > resorted to dishonesty, evasions, lame excuses, etc. and this is why i _do_ reply to many of jose's posts -- because the antidote to his dishonesty is _sunshine_. jose puts his spin out, i correct it, and truth is served. > I find it amusing how you try to pass yourself off > as an expert on making ebooks, then show yourself > to be an inept bungler. :) look, that has a smiley on it. how cute. :+) > So how long have you been suffering from paranoia? ;) a winky smiley. even cuter! ;+) > I haven't been dishonest here, grossly or otherwise. yes you were. and i explained why... > Here come the evasions and lame excuses. ...and you called that explanation "evasions and lame excuses". 
my description and yours are far enough apart that i can trust
the readers to come to their own decisions on which is correct.

> Could you point out the post in your "jean of the lazy a" series
> in which you were "extremely clear" that you were using
> "the uncorrected o.c.r. text" from the IA?

of course i can. i explained it _in_detail_ -- with a link --
in the first post. and every post after the first one
started with this line:
> we're resolving the p.g./archive.org differences in "jean of the lazy a".

very first line. every post. what did you think i was comparing?

***

> Then you dared me to check your text
> and said I could blame you for any errors I found.

and you can indeed blame me for all of the merge errors
that came out of the comparison methodology, and i've
said in my last post that i did indeed make such errors...

but the errors that were unrelated to the comparison?
those were of no consequence to me. or to my point.

my point was that al could have uncovered many errors
by doing a straightforward comparison. i made my point.
i proved its validity by doing the comparison, and finding
lots of errors, which i listed in detail, thankyouverymuch.

you found more errors, ones which were not related to
the value of the comparison methodology. good for you.
but it doesn't have any bearing on the point i was making.

the only relevance it _might_ have is that -- when a human
gets involved, you can expect the human to make mistakes.
which _also_ backs up my point, although i'm not sure that
people would give any credence to an argument that i could
"provide evidence for" simply by making a lot of mistakes...
"humans make mistakes; i can prove it by making mistakes."

> If you didn't do additional checks on the text
> before issuing that challenge to me,
> it doesn't say much for your common sense. :)

oh, but it _does_, jose, it surely does.
because one of the steps in my workflow is to goad you into checking _every_ e-text i work on, because you have very sharp eyes. forget those thousands of proofers over at d.p.-land, they can't compare to jose, the super-proofer! > Where did I say I was trying to provide a counterargument > against comparing texts? I clearly stated the point of my post > right at the start: "Unfortunately, Bowerbird, your boasting > is exceeded by your bungling." Neither comparison nor > any other methodology will give very good results if you > bungle the job, Bowerbird. And you *did* bungle it. no, i made my point quite clearly. and correctly. which is exactly why you're not trying to provide a counterargument. instead, you're trying to drag some other argument into the fray -- one of no importance to me -- and then counter _that_ one. > You make errors?! With your expertise on efficient workflows, > on marching texts to perfection? With those super tools > you've programmed? Say it ain't so! ;) yes, even with all that, i still make errors. see? nobody's perfect. not even you, jose. you still haven't proofed all of the books that google has scanned, so you better get to work. when you're done, _then_ you'll be perfect. > The errors would have to have been in both the PG e-text > *and* the IA's OCR file for the comparison not to isolate them. > But most of them weren't in the IA's file as you admitted > in that later reply to Keith Schultz. yeah. i made mistakes. and when i "admitted" that most of the errors were not in the archive.org text, i said i'd made mistakes doing the merge during the comparison. what else could i say? my mistakes were clearly documented. so yes, i made mistakes. have i said it enough times yet? i made mistakes. i flubbed up. i made errors. lots! bugs... mistakes... beaucoup flubberoos... yet, in the face of all that incompetence, i _still_ managed to create a convincing argument for the point i set out to prove. 
pretty darn amazing to pull it off in spite of all my "bungling". (and -- for anyone who cares about the _real_ issues here -- the errors i made during this experiment haven't surprised me, because i had to do a lot of reworking of this text to get it into shape to _do_ the comparison. one thing was that linebreaks between the two files were different. another thing was that the archive.org text was missing em-dashes. things like that. my particular tools are line-based, which is fine when you are comparing two scan-sets of the same book, because the lines match up just fine. but i have to do a lot of prep work before i can compare a scan-set text against a rewrapped p.g. e-text. it also means that i have to "ignore" a bunch of differences that are meaningless -- e.g., one that reflects a missing em-dash -- and in the process of ignoring some, i inevitably missed others. big deal. i still found more than enough bugs to make my point. the take-away lesson of all this is "don't rewrap the linebreaks". but we all know jose isn't interested in the actual _work_ here, he just wants to throw rocks at me... so let's get back to that.) > You would have come out the worse because > some of those errors weren't in either the PG or IA files. > *You* introduced them into your file. i introduced my own edits, yes. i always do. i always will. things like dropping the period after "per cent.", or even -- horror of horrors -- _changing_ it to a comma instead. i also dehyphenate "to-day" and "to-morrow", because that is what we do these days. the old variants are so yesterday. maybe somebody counts those edits of mine as "errors". fine. maybe _you_ count them as errors, jose. fine. everybody can make their own judgments, i'm cool with that. public-domain means that text belongs to _you_, so do with it what you want. it means that text belongs to _me_. so i do with it what i want. but yeah, that "be looked" that shoulda been "he looked", well, that is clearly an error. 
and you'll know that i considered that to be an error
when you see that i change it in my next version.
but i won't be changing the _edits_ that i made.
so you can tell.

> For example, both the PG and IA files had the correct word
> "gabbling." You changed it to "gabbing" in your file.

you neglected to tell people that that was an end-line hyphenate.

in resolving that -- dehyphenating it so as to match the p.g. e-text,
since the p.g. e-texts have removed the end-line hyphenates --
i made an editorial decision to change it to "gabbing".

i obviously didn't check the dictionary, or i would have found that
"gabbling" is indeed a valid word. so that's an edit i now consider
a mistake; thus i will undo it on my next version.
thanks for the bug report.

by the way, here's the dictionary i use:
> http://z-m-l.com/go/regulardictionary.txt

and yes, it contains "gabbling"...

> You mistakenly added the hyphens after "any"
> at the ends of these two lines:

i can't remember if i added those hyphens myself, manually,
or if my automatic hyphenation routines added them.
either way, i'll look at 'em again, specifically and closely,
and make a decision.

(back to a real issue, just for a second. many of the archive.org
texts are missing end-line hyphens, so you have to restore them.
my results indicate that the best routine is to insert such a hyphen
if the last chunk of letters at the end of a line and the first chunk
of letters in the next line create a chunk that's in the dictionary.
this routine is "the best", but it _does_ give some bad results,
and the biggest case of these bad results is with words that
start with "any" and "every" and "some". other word-pairs
also are problems, but the any/every/some words are common
and need verification. this experiment wasn't crucial enough
to require such verification. but perhaps this'll be useful to
the next person making such a tool, since this is a job that
needs to be done on many archive.org texts.
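A minimal sketch of the hyphen-restoration routine described above, in Python. The function name `restore_hyphens`, the `RISKY_WORDS` set, and the choice to flag any/every/some joins for a human instead of hyphenating them automatically are assumptions made for illustration, not a description of the actual tool. The dictionary is assumed to be a set of lowercase words, for example loaded from a word list like the one linked above.

```python
import re

# Line-final words whose joins are common false positives ("any" + "one").
# Treating them as review-only is an assumption of this sketch.
RISKY_WORDS = {"any", "every", "some"}

def restore_hyphens(lines, dictionary):
    """Re-insert missing end-line hyphens.

    A hyphen is appended to a line when the letters ending that line plus
    the letters starting the next line form a word found in `dictionary`.
    Joins whose first half is a risky word are not changed; they are
    returned in `flagged` as (line_index, joined_word) for manual checking.
    """
    out = list(lines)
    flagged = []
    for i in range(len(out) - 1):
        line = out[i].rstrip()
        tail = re.search(r"[A-Za-z]+$", line)            # last chunk of letters
        head = re.match(r"\s*([A-Za-z]+)", out[i + 1])   # first chunk of next line
        if not tail or not head:
            continue
        joined = (tail.group(0) + head.group(1)).lower()
        if joined not in dictionary:
            continue
        if tail.group(0).lower() in RISKY_WORDS:
            flagged.append((i, joined))   # leave the text alone, ask a human
        else:
            out[i] = line + "-"
    return out, flagged
```

Flagging the risky joins keeps a line ending in "any" followed by a line starting with "one" out of the automatic pass, at the cost of a short manual review list.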
we now return you to the drama of the jose/bowerbird squabble...) > Neither the PG or IA texts have the comma > after the exclamation mark in this line: > > Lee Milligan was the drowning man!, and the agony of > How you managed to add that comma is beyond me. well, why don't you just ask me? because i can explain it. one of the general edits i do these days -- to see how it works, so as to decide if i'll make it a policy extending into the future -- is to introduce some kind of punctuation (most typically a comma) after any mid-sentence exclamation marks. as an alternative mode, i will convert the letter immediately following the exclamation mark to uppercase, thereby creating two sentences from the previous one. i'm trying out these edits so that my e-texts can be subjected to a general check that all exclamation marks are followed by _either_ a punctuation mark (meaning that they had occurred mid-sentence) _or_ a capitalized word (meaning they were sentence-terminating). given that check, the p-book text would've triggered a false alarm... part of what i am attempting to do is to eliminate such false alarms. (another is to introduce library-wide consistency of a house-style.) since it's a general policy, so you'll see it a lot in the books i do, and not just exclamation marks, but question marks too... ok? why, do you feel that adding that comma violates the author's intent? > But then you're the expert (self-proclaimed) on making ebooks. where did i self-proclaim that? i'd rather scoff at "experts" than be one. i'll tell you what i _do_ proclaim about myself: i do a lot of research. and i share the results of that research with people on this listserve. > I didn't proofread a single page of that e-text. > I downloaded the IA's DJVU file of the book, > extracted the page scans, then used my usual > image enhancement/OCR techniques > to get extremely accurate OCR. 
you know, jose, it would be very helpful to a lot of people here if you shared some detailed information about your workflow... but you don't do that, do you? you know how to get stunningly accurate o.c.r. out of scans -- and that's information a lot of people here really want, or info they'd want if they didn't think it was "impossible" -- but you don't contribute that information, you just sit on it... and you've been sitting on it for _years_. why is that, jose? so, folks, i'll tell you how jose does it. he accidentally spilled enough of the beans one time that i could piece it together... he quickly realized he shouldn't have let the cat out of the bag, and thus refused to verify a procedure when i asked for details, so i might have some of this wrong, but you should be able to get a lot of mileage out of this general approach if you test it... first, jose uses a very old version of _textbridge_. that package doesn't have the best reputation, and it is no longer supported, but pay attention to the fact that jose gets super results from it. the lesson might be that the program means less than the scan. what he calls "my usual image enhancement/o.c.r. techniques" consists of changing the resolution of a scan through a range, and re-doing the o.c.r. at each resolution to assess the accuracy. he says that it's typically very easy to see which resolution gives the best o.c.r. when the resolution is too low, the o.c.r. is awful. and when it's too high, it's also awful. in the middle, it's great... his surprising finding is that the best o.c.r. often comes at 200dpi. the naive expectation of people is that higher resolution is better. but they don't even do these resolution tests to see if that is true. jose says that if you do this test on a couple pages from a book -- that's all it takes -- you will converge on the right resolution. jose has often reported that his o.c.r. output is "almost perfect". now, the thing is that jose mentioned _only_ the _resolution_... 
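The resolution test described above could be sketched roughly as follows, in Python. Everything named here is hypothetical: `run_ocr` stands in for whatever resampling and OCR tools are in use, the candidate settings are (dpi, scale percent) pairs chosen by the person testing, and accuracy is scored against a hand-checked transcription of a couple of sample pages. That scoring step is an assumption, since the thread only says the best setting is "very easy to see".

```python
from difflib import SequenceMatcher

def accuracy(ocr_text, reference):
    """Similarity in [0, 1] between OCR output and a hand-checked reference."""
    return SequenceMatcher(None, ocr_text, reference, autojunk=False).ratio()

def best_setting(sample_pages, settings, run_ocr):
    """Pick the setting with the highest mean accuracy over a few sample pages.

    sample_pages: list of (page_image, reference_text) pairs
    settings:     candidate (dpi, scale_percent) tuples to try
    run_ocr:      hypothetical callable (page_image, dpi, scale_percent) -> str
    """
    scores = {}
    for dpi, scale in settings:
        per_page = [accuracy(run_ocr(img, dpi, scale), ref)
                    for img, ref in sample_pages]
        scores[(dpi, scale)] = sum(per_page) / len(per_page)
    best = max(scores, key=scores.get)
    return best, scores
```

Once a winner is found on the sample pages, the same setting would be applied to the rest of the book in a batch run.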
digging deeper -- asking about things like brightness/contrast,
deskewing, cropping, color/greyscale/black&white, despeckling
-- made jose clam up. but at the outset, it was only resolution...

i'd be perplexed if despeckling didn't give similar results, since
it would seem you could run the gamut from heavy despeckling
that removed all the punctuation, to no despeckling at all, where
excess dots often get misrecognized as extraneous punctuation.

but jose didn't mention despeckling. it was only _resolution_...

maybe jose's findings are unique to his scanner, or to textbridge,
or something else idiosyncratic to jose. but whatever it might be,
jose seems to have found the secret to extremely accurate o.c.r.

but he doesn't seem to want to tell you how to get it.

indeed, if i hadn't pulled him out of his shell to attack me,
he wouldn't even have revealed to us that it's a possibility
to get "almost perfect" results. he doesn't want us to know.

i wonder why...

at any rate, i suspect it's not hard to get great o.c.r. output,
not if you follow up on jose's findings and do some research.

i'd do some experiments myself, but i don't have a p.c. here,
and there's simply no decent o.c.r. software on the mac side,
plus i don't know the first thing about manipulating graphics,
and i decided to leverage o.c.a. o.c.r. instead of doing it myself
-- yes, it's flawed, but at least it's flawed in a consistent way --
so someone else will have to do this exploration. sorry charlie.

but the secret is that it's possible to get "almost perfect" o.c.r.

and it sure would be nice if jose just explained it all to us...

oh, yeah, and in case you didn't notice, jose has confirmed here
that once he has the "almost perfect" o.c.r., he uses _comparison_
to find errors in the other text. so, to put this slightly differently,
jose is supporting my argument on the value of using comparison.
kind of amazing that someone supporting my side of the argument
phrases that support in a way that looks like he's on the other side,
isn't it? that's why i describe jose as being "grossly dishonest" here.

> You truly are pathetic.

my, my, isn't the level of discourse high?

oh well, as long as the lurkers are being entertained...

-bowerbird

p.s. jose's record: 4 noise, 4 signal.

From schultzk at uni-trier.de Thu Nov 20 13:53:41 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Thu, 20 Nov 2008 22:53:41 +0100
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID: <9D62B16C-6A3A-4844-8F68-D3606DA21881@uni-trier.de>

O.K. BB and Jose,

Stop the child's play.

Whether either of you two believes me or not:
It can be mathematically proven that 1+1 = 5.

Before you start to challenge me for the proof of the above-stated
hypothesis, you have to give proof that 1+1 = 2 is factual.

As another side thought, actually 1 and 1 is 11.

I hope you get what I am getting at.

Regards to the kindergarten.

Keith.

From ebooks at ibiblio.org Thu Nov 20 19:59:17 2008
From: ebooks at ibiblio.org (Jose Menendez)
Date: Thu, 20 Nov 2008 22:59:17 -0500
Subject: [gutvol-d] already doing it right there
In-Reply-To:
References:
Message-ID: <49263215.4020706@ibiblio.org>

On Nov. 19, 2008, Bowerbird wrote:

> i sure hope people are being entertained by all this...

I doubt you really care as long as you're getting attention. :)

> jose said:
> > This reply from Bowerbird is a perfect example
> > of why I don't reply to so many of his posts.
> > Rather than honestly admitting his bungling, he > > resorted to dishonesty, evasions, lame excuses, etc. > > and this is why i _do_ reply to many of jose's posts -- > because the antidote to his dishonesty is _sunshine_. > > jose puts his spin out, i correct it, and truth is served. I have to give Bowerbird credit. Not only is he dishonest, he's brazenly dishonest. Unfortunately for him, in his reply he provided clear, unmistakable evidence of his dishonesty: > you know, jose, it would be very helpful to a lot of people here > if you shared some detailed information about your workflow... > > but you don't do that, do you? > > you know how to get stunningly accurate o.c.r. out of scans > -- and that's information a lot of people here really want, > or info they'd want if they didn't think it was "impossible" -- > but you don't contribute that information, you just sit on it... > > and you've been sitting on it for _years_. why is that, jose? Why don't you tell us why you're lying, Bowerbird? I described in detail how I enhance images for OCR on the DP forums 3 years ago. Back on Sept. 30, 2005, I posted this message to the Book People mailing list, comparing the ebook I made from Google's scans of "Books and Culture" to DP's version of the same book: http://onlinebooks.library.upenn.edu/webbin/bparchive?year=2005&post=2005-09-30,3 (I know that a lot of you are probably familiar with that post either from being past members of the BP list or because Bowerbird has linked to it several times in the past.) In that post, I noted several errors in DP's ebook including a missing page. Some days later, a DP volunteer told me in a private email that there was a discussion about my post on DP's forums. So I registered at DP and joined in. On Oct. 
20, 2005, I posted this message to Jim Tinsley on the DP forums: http://www.pgdp.net/phpBB2/viewtopic.php?p=165068#165068 Here are a few excerpts from it: "By the way, Jim, since you were so interested in comparing the results obtained with different OCR packages, you might be interested to know that I used a 6-year old version of TextBridge Pro 8.0 to make my version of _Books and Culture_ and got excellent results. Many pages came out perfectly, down to the smallest punctuation mark.... "Now, if you're familiar with the old TextBridge Pro 8.0, you may recall that it wouldn't OCR JPG images. So Google's page scans weren't of any use to me as-is. I had to convert them to TIFs. I also had to enhance them to get the best OCR results, enlarging the original images and changing their resolution. (Interestingly enough, increasing the resolution past 200 DPI made the OCR much worse in that case.) Once I found the best combination by experimenting on a couple of images, I quickly batch converted the rest of them...." On Oct. 26, 2005, I posted this reply to Jim: http://www.pgdp.net/phpBB2/viewtopic.php?p=166250#166250 Excerpt: "As for improving the images with automation, what I did is experiment manually with just a few images. Once I found the size and resolution that yielded the best OCR, I batch converted all the remaining Google images in seconds--resizing them, changing the resolution, and converting them to TIFFs. Oh, I also had the batch conversion crop off the 'Google Print' running down the sides of the images, because the letters caused errors in the OCR...." That same day, Bruce Albrecht posted this message: http://www.pgdp.net/phpBB2/viewtopic.php?p=166266#166266 Excerpt: "Jose, I've been wondering what you've meant by 'enlarging and enhancing' the images? ..." I posted a lengthy, detailed reply the same day: http://www.pgdp.net/phpBB2/viewtopic.php?p=166386#166386 Excerpts: "It's not too complicated. 
First, if you use a Windows OS, I'd recommend you download Irfanview, the program murraypaul mentioned. It's small, very fast, and amazingly powerful for a free program. You can get it and see all the features here: http://www.irfanview.com/ ... "Now, here's what I did with the Google JPEGs for _Books and Culture_.... "As I mentioned in an earlier post, my old OCR software doesn't work with JPEG images, so I had to convert them to TIFFs. So the first thing I did is pick an image of average quality and saved it as a TIFF. (With Irfanview it's very simple. Just click on 'Save as' from the 'File' menu. You'll get a dialogue box where you can choose from quite a few graphic file types. In the case of TIFFs, there are different compression options. I used Huffman RLE, which saved the images as black and white.) When I OCRed it, the results were pretty bad. So I knew I needed to experiment with enlarging the image and increasing its resolution. "To do that with Irfanview, you'd click on 'Resize/Resample' under the 'Image' menu. You'll get a dialog box, and the first thing to do is make sure it's set to use the resample size method for better quality. There's a choice of filters, too. I got the best results using the Lanczos filter. "In the same dialog box, you can set the resolution (DPI) and change the image dimensions. In my first test, starting with the original JPEG, I cropped the image to remove the 'Google Print' and some of the excess white space. Then I set the resolution to 200 DPI and 150% for the width and height...." I went on to mention some of the other combinations I tried, "300 DPI and 200%," "200 DPI and 200%," etc. And near the end of my post, I wrote: "If you have any questions, feel free to ask." But neither Bruce nor any other DP member asked for more details about my technique. > so, folks, i'll tell you how jose does it. he accidentally spilled > enough of the beans one time that i could piece it together... 
> he quickly realized he shouldn't have let the cat out of the bag, > and thus refused to verify a procedure when i asked for details, > so i might have some of this wrong, but you should be able to > get a lot of mileage out of this general approach if you test it... "Accidentally spilled enough of the beans"? No, I deliberately brought up my technique on DP's forums to help them out, and when Bruce asked me for more details, I gave him a lot more details. *You* weren't even part of the discussion, Bowerbird. > first, jose uses a very old version of _textbridge_. that package > doesn't have the best reputation, and it is no longer supported, > but pay attention to the fact that jose gets super results from it. > the lesson might be that the program means less than the scan. No. As I've always said, I use TextBridge Pro 8.0. Note the "Pro." There is a huge difference between plain TextBridge and the "Pro" version. > what he calls "my usual image enhancement/o.c.r. techniques" > consists of changing the resolution of a scan through a range, > and re-doing the o.c.r. at each resolution to assess the accuracy. [snip] > now, the thing is that jose mentioned _only_ the _resolution_... Another lie, Bowerbird. As I wrote in that post back on Oct. 20, 2005, my enhancement technique involved "enlarging the original images and changing their resolution." I never said it was only the resolution. > digging deeper -- asking about things like brightness/contrast, > deskewing, cropping, color/greyscale/black&white, despeckling > -- made jose clam up. but at the outset, it was only resolution... More lies. At the outset, I clearly said, "enlarging the original images and changing their resolution." And when Bruce asked me about it, I didn't "clam up." As I quoted above, I told him, "I cropped the image to remove the 'Google Print' and some of the excess white space." I also told him, "I used Huffman RLE, which saved the images as black and white." 
I didn't mention "brightness/contrast," "deskewing," or "despeckling" because I was describing what I actually *did* on Google's page scans. With my slow typing speed, I certainly wasn't going to waste time talking about things I *didn't* do. Now, perhaps you'll claim that *you* questioned me privately about "deskewing," "despeckling," etc. If you do claim that, please provide the exact date(s) of your email(s) because I can't find one where you asked me about "deskewing," "despeckling," etc. > i'd be perplexed if despeckling didn't give similar results, since > it would seem you could run the gamut from heavy despeckling > that removed all the punctuation, to no despeckling at all, where > excess dots often get misrecognized as extraneous punctuation. > > but jose didn't mention despeckling. it was only _resolution_... Repeating a lie over and over doesn't make it true, Bowerbird. As I've shown conclusively, I didn't say it was only resolution. > maybe jose's findings are unique to his scanner, or to textbridge, > or something else idiosyncratic to jose. but whatever it might be, > jose seems to have found the secret to extremely accurate o.c.r. > > but he doesn't seem to want to tell you how to get it. So I told them on the DP forums 3 years ago because I didn't want to tell them? Uh huh. > indeed, if i hadn't pulled him out of his shell to attack me, > he wouldn't even have revealed to us that it's a possibility > to get "almost perfect" results. he doesn't want us to know. Once again, here's what I wrote in my DP post (addressed to Jim Tinsley) on Oct. 20, 2005: "By the way, Jim, since you were so interested in comparing the results obtained with different OCR packages, you might be interested to know that I used a 6-year old version of TextBridge Pro 8.0 to make my version of _Books and Culture_ and got excellent results. Many pages came out perfectly, down to the smallest punctuation mark...." Could you point out where I attacked you in that post? 
Could you point out where I referred to you at all in that discussion?

> i wonder why...

I wonder why you lie so much, Bowerbird. :)

Jose Menendez

From Bowerbird at aol.com Thu Nov 20 20:45:59 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 20 Nov 2008 23:45:59 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

jose seems determined to insult me every way that he can.

good thing i have a thick skin...

***

ok, well, i hope that elaboration by jose helps people out.
he's told us how he gets fantastic results from his o.c.r.

he's prone to say less than he knows, so i wonder if he
told us _all_ of his secrets, but you've got a good start.

in private e-mail following his post on the d.p. forum,
jose clammed up. or maybe, as he implies, he never got
my e-mail. it happens. but if _you_ ask him for details
on this public listserve, perhaps he will be forthcoming.

so there you go, folks. i'd suggest you follow up on that.
many people in the past seem to have dropped the ball...

but it's information that digitizers absolutely should have,
in particular, it will save a lot of volunteer time over at d.p.

so i urge those of you who do o.c.r. to pursue this research,
because jose will just sit on it for more years if you let him...

unless i decide to goad it out of him again... :+)

(and if you guys don't follow up, i _will_ goad him again.)

so, if you do o.c.r., and your results are less than perfect,
there's something you can learn here about doing it better.

-bowerbird

p.s. oh and, just as one more reminder, once he's got his
"almost perfect" o.c.r., jose uses the comparison method.
so you have yet more proof of the value of that method...
From Bowerbird at aol.com Thu Nov 20 20:53:53 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 20 Nov 2008 23:53:53 EST
Subject: [gutvol-d] already doing it right there
Message-ID:

i said:
> (and if you guys don't follow up, i _will_ goad him again.)

and this wasn't the first time i've bumped that 2005 d.p. thread.

i bumped it over a year after it sputtered, in december of 2006:
> http://www.pgdp.net/phpBB2/viewtopic.php?p=268884#268884

now i'm here -- some two years later -- bumping it once again.

you see, i might be a bungler, but i am a _persistent_ bungler... ;+)

-bowerbird

From Bowerbird at aol.com Fri Nov 21 02:03:30 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 21 Nov 2008 05:03:30 EST
Subject: [gutvol-d] trail of the white mule -- 001
Message-ID:

well, jose has shown me that i need to do more work on
getting my comparison tools whipped into shape, so i'll
be looking at a few more p.g. "reposted" e-texts. :+)

next up is "trail of the white mule", reposted wednesday.
> http://www.gutenberg.org/files/2063/2063.txt

someone uploaded the google scan-set to archive.org,
who did o.c.r. on it. neat. we're sidestepping google's
unwillingness to give us the text in a convenient format,
routing around the damage caused by the selfish entity!
> http://www.archive.org/details/trailwhitemule00bowegoog

***

one of the first checks, you'll remember, is paragraphing.

this one was pretty good; i found only 1 paragraph error.

namely, the paragraph that starts with this line:
> The old woman dropped her hands to her
was accidentally merged into the paragraph above it.
> http://z-m-l.com/go/wmule/wmulep083.jpg

more on this book later...

-bowerbird

From schultzk at uni-trier.de Fri Nov 21 03:11:33 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri, 21 Nov 2008 12:11:33 +0100
Subject: [gutvol-d] trail of the white mule -- 001
In-Reply-To:
References:
Message-ID: <65B6678E-48E5-4A7A-87E0-B8B522D97798@uni-trier.de>

C'mon Guys,

Do we have to go over this AGAIN!

It should be well known by now that the quality of OCR is highly
dependent on the scan resolution and the software used. This does
not mean the higher the resolution, the better. Other factors are
contrast, noise in the scan, and the quality of the original. These
FACTS have been known since scanners were first made.

Furthermore, the scans can be enhanced by other software, increasing
the quality of the OCR. Also, with the help of dictionaries the
quality can be increased. Other methods can help, too, as has been
discussed here!

We can agree that there is always a better or new way that will be
more efficient. I believe it is up to each individual or institution
to decide whether to use them once they are presented and their
caveats explained.

As far as efficiency and costs are concerned, the best OCR machine
is a human.
That is why many companies and libraries send their works or scans
to China to be "OCRed" by an army of workers! (If I remember
correctly, this has also been touched on here!!)

So why not continue your work and find better ways to do things,
and stop worrying about your egos or hurt feelings. As the saying
goes, "Sticks and stones may break my bones but [scans] may never
hurt me."

regards
Keith.

From 1001 at atlanticbb.net Fri Nov 21 06:46:32 2008
From: 1001 at atlanticbb.net (1001 at atlanticbb.net)
Date: Fri, 21 Nov 2008 09:46:32 -0500
Subject: [gutvol-d] trail of the white mule -- 001
References:
Message-ID: <005a01c94be7$f650f880$680fa8c0@atlanticbb.net>

Volunteers around the world are uploading google texts to the
Internet Archive. Google frowns on persistent downloaders by
blocking their IP address, so unlimited access is available on the
Internet Archive. As one who works on the novels of Jules Verne, I
find that most of these scanned by google have been reposted. The
only problem seems to be there is no way to post an item from
another pd website without downloading it and then uploading it
again.

nwolcott2 at post.harvard.edu

----- Original Message -----
From: Bowerbird at aol.com
To: gutvol-d at lists.pglaf.org ; Bowerbird at aol.com
Sent: Friday, November 21, 2008 5:03 AM
Subject: [gutvol-d] trail of the white mule -- 001

well, jose has shown me that i need to do more work on
getting my comparison tools whipped into shape, so i'll
be looking at a few more p.g. "reposted" e-texts. :+)

next up is "trail of the white mule", reposted wednesday.
> http://www.gutenberg.org/files/2063/2063.txt

someone uploaded the google scan-set to archive.org,
who did o.c.r. on it. neat. we're sidestepping google's
unwillingness to give us the text in a convenient format,
routing around the damage caused by the selfish entity!
> http://www.archive.org/details/trailwhitemule00bowegoog

***

one of the first checks, you'll remember, is paragraphing.
this one was pretty good; i found only 1 paragraph error.

namely, the paragraph that starts with this line:
> The old woman dropped her hands to her
was accidentally merged into the paragraph above it.
> http://z-m-l.com/go/wmule/wmulep083.jpg

more on this book later...

-bowerbird

_______________________________________________
gutvol-d mailing list
gutvol-d at lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d

From Bowerbird at aol.com Fri Nov 21 08:08:12 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 21 Nov 2008 11:08:12 EST
Subject: [gutvol-d] trail of the white mule -- 001
Message-ID:

keith said:
> Do we have to go over this AGAIN!

well, yeah, as a matter of fact, we do...

because the first time around -- in 2005 -- the people in charge (in the form of tinsley, not to mention the powers that be at d.p.) missed the boat, even after initial interest.

and the second time around -- in 2006 -- the people in charge didn't even get the hint.

thus now, _two_years_later_ -- in 2008 -- i'm giving it yet another shot at recognition.

jose is aware of some things that should be of value to the people at distributed proofreaders, and i'm just tryin' to get 'em to notice that fact...

because they're still wasting the time and energy of their proofers by sending them inferior o.c.r., and there's a moral question on the ethics of that.

i surely do not think it is "excessive" to bring that to the attention of those people every year or two.

-bowerbird

From Bowerbird at aol.com Fri Nov 21 08:20:10 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 21 Nov 2008 11:20:10 EST
Subject: [gutvol-d] trail of the white mule -- 001
Message-ID:

norm said:
> Volunteers around the world are uploading google texts to the Internet Archive.

all along, i've been saying that people should do this. so i'm glad that they are. i had noticed it out of the corner of my eye before, but this time i saw it clearly...

> The only problem seems to be there is no way to post an item from another
> pd website without downloading it and then uploading it again

i'm guessing you can get archive.org to do a sweetheart arrangement with ibiblio, since they are currently grabbing the e-texts from project gutenberg from there...

the reason _that_ might be of interest is that i once did the programming so that i could save google's scans directly to the "snowy" account greg newby gave me...

from snowy, they could travel over the backbone to ibiblio, then jet to archive.org.

or -- even better -- perhaps archive.org and snowy could hook up directly?

the code is pretty simple, as you just scrape some google .html to find the u.r.l. of the book itself, and then grab it from there. could save people a lot of unnecessary dancing around with the downloading (from google) and uploading (to archive.org).

-bowerbird

From Bowerbird at aol.com Fri Nov 21 09:48:17 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 21 Nov 2008 12:48:17 EST
Subject: [gutvol-d] trail of the white mule -- 001
Message-ID:

keith said:
> another way to improve this method and avoid
> unnecessary checks of words that are not in the dictionary
> is to use markov chaining analysis on the spelling.
>
> You could use a concordance to do different kinds
> of analysis or help with the above. This will help with
> idiosyncrasies of an author.
>
> Similarly, markov chains used on the word level can be used.
>
> Adding, a morphological component could be used.

that's pretty good for an off-the-top response, keith.

at the same time, it's out-of-touch with the _reality_ of the relatively crudish nature of the task-at-hand.

we don't need those sophisticated types of analysis; nope, we need something that's a _lot_ more simple.

for instance, let's consider the "paragraphing" data that i just collected from the "white mule" reposting.

what i do with that is to compare the first lines from the paragraphs in both versions, and i see if/where there are discrepancies.

for this task, however, i'm looking for a wild mismatch, where one line is _completely_ different from the other. since we're looking for cases where a paragraph was inadvertently split, or incorrectly merged with another.

so mismatches that result from simple o.c.r. differences are to be ignored in this task.

i've appended the data from this comparison for you, because it illustrates the arena in which we're working.

with one exception (noted later), these lines "match up". so _these_cases_ are what o.c.r. errors look like today...
these are the types of corrections that we have to make. if you look at those errors, you'll see they're very crude. i sincerely doubt that markov chaining will help much, or any "sophisticated" technique. for these changes, you need to employ a big stick, not a refined analysis... oh, and i already employed one such "big stick" here. there were lots and lots of cases with a double-quote at the line-start that were misrecognized as asterisks. indeed, there were 257 such misreads as "** ". _257_. easy to say "do a blind global change on those babies", and be quite confident there won't be any bad changes. i am sure if you peruse the appended lines, you will find many more global changes that would also be beneficial. more importantly, you will grasp the _reality_ of our task. *** also, i now see one more place where there might be a paragraphing problem in the project gutenberg repost, namely paragraph 348, which it appears is a mismatch. yep, the p.g. repost has a paragraphing error here: > Through his slits of swollen lids Casey glared > http://z-m-l.com/go/wmule/wmulep110.jpg > http://www.gutenberg.org/files/2063/2063.txt so that's _2_ paragraphing errors on this reposting... *** by the way, whenever you look at differences like this, you're bound to notice the occasional error by chance. such was the case here. if you look at paragraph 942, you'll see this: > 942 "Naw. This ain't no trouble," he grunted > 942 "Naw. This ain't no trouble," he granted the top line -- from the o.c.r. -- looks right (grunted), while the bottom line -- from the p.g. reposting -- looks questionable (granted). sure enough, the scan says "grunted". so it looks like we need to start a list... -bowerbird 4 "This'U mean thirty days for you," splut 4 "This'll mean thirty days for you," splu 5 "Get the undertaker on the line first 1 5 "Get the undertaker on the line first!" 9 ""Why do you persist in making trouble fo 9 "Why do you persist in making trouble for 11 "All right, ma'am. 
You can drive, then.* 11 "All right, ma'am. You can drive, then." 13 "You was in a hurry to git home, ' * Case 13 "You was in a hurry to git home," Casey p 15 "Police be danmed! I'm tryin* to please 15 "Police be damned! I'm tryin' to please 16 "Heyl "Whadda yuh mean, blockin' th 16 "Hey! Whadda yuh mean, blockin' the 20 "It's coining to a show-down, Jack,'* sh 20 "It's coming to a show-down, Jack," she 22 "It sounds much better, putting it that way, ' 22 "It sounds much better, putting it that way," m 23 "It isn't spoiled,'* I grinned. "Case 23 "It isn't spoiled," I grinned. "Casey 29 "Well-sir, ' * she drawled, making one wor 29 "Well-sir," she drawled, making one word o 32 "He d??d, yes. But his idea of luxury is sit 32 "He did, yes. But his idea of luxury is sitt 34 "And you know, * ' she went on, shuffling th 34 "And you know," she went on, shuffling the c 36 "He did one good stroke of business,'* I ven 36 "He did one good stroke of business," I vent 48 "Well-sir, he 's gone I * * she announced, an 48 "Well-sir, he's gone!" she announced, and sto 57 He was fortunate in buying a demonstrator * 57 He was fortunate in buying a demonstrator's 61 He was teUing himself that he didn't car 61 He was telling himself that he didn't ca 62 He climbed stiflSy out, squinted at the sk 62 He climbed stiffly out, squinted at the sk 63 "If they wanta come pinch me here, 1*11 mee 63 "If they wanta come pinch me here, I'll mee 65 "Hell! " said Casey, breathing deep when 65 "Hell!" said Casey, breathing deep when, 84 "Whereas that there Joshuay tree pointin? 84 "Where's that there Joshuay tree pointin' 96 "Somebody ??s gunnin* fer us, looks like t? 96 "Somebody's gunnin' fer us, looks like t' m 98 "So'd I,'' boasted Barney, "but that ain?? 98 "So'd I," boasted Barney, "but that ain't 99 "What I'm figurin' out now,'' said Casey 99 "What I'm figurin' out now," said Casey, 103 "What yuh figurin' on doin'T " Case 103 "What yuh figurin' on doin'?" 
Casey 110 Bamey, therefore, dug like a badger with 110 Barney, therefore, dug like a badger with 112 Standing there pufling and wondering what t 112 Standing there puffing and wondering what t 126 So here was what the boulder concealed, - 126 So here was what the boulder concealed, 134 Away from the cabin a pebble *s throw, h 134 Away from the cabin a pebble's throw, he 145 Against the limitations prescribed by his ma 145 Against the limitations proscribed by his ma 157 "Git up off 'n him! " a new voice commande 157 "Git up off'n him!" a new voice commanded 160 "Paw, you oP fool, you, get your finge 160 "Paw, you ol' fool, you, get your fing 162 "* Aw, you shut up, Paw. You ain 't gittin 162 "Aw, you shut up, Paw. You ain't gittin' no 166 "I'd fix 'im here an' now," threatened Paw 166 "I'd fix 'im, here an', now," threatened P 171 "Ever drill in rock? '?? he asked shor 171 "Ever drill in rock?" he asked shortly 172 "Mebbe I have an' mebbe I ain't,'' Case 172 "Mebbe I have an' mebbe I ain't," Casey 173 "Here's a drill, an' here's your single- jack 173 "Here's a drill, an' here's your single-jack. 177 "Two of us waitin * to see your boss, huh ? ' 177 "Two of us waitin' to see your boss, huh?" Cas 182 "Y 'ain't told us yet what brung yuh up o 182 "Y'ain't told us yet what brung yuh up on 185 "More likely * White Mule '. ' ' Casey cocke 185 "More likely 'White Mule.'" Casey cocked a k 188 Casey grinned secretively. * * A man can 't b 188 Casey grinned secretively. "A man can't be pi 189 "He '11 kinda like to meet you, ' ' Joe returne 189 "He'll kinda like to meet you," Joe returned da 191 "Guess I got youm/' Hank leered, **when 191 "Guess I got yourn," Hank leered "when s 192 "If any one 's 'been usin ' a high-power i 192 "If any one's 'been usin' a high-power it 197 "We got to hold ye, ' ' Paw spoke up unc 197 "We got to hold ye," Paw spoke up unctio 198 "Aw, shut up. 
Paw, you oP fool, you," Han 198 "Aw, shut up, Paw, you ol' fool, you," Ha 199 "Well now, when it comes to fciWin'," Case 199 "Well now, when it comes to KILLIN'," Case 205 "Other feller hurt bad! " he inquired care 205 "Other feller hurt bad?" he inquired carel 229 So please don??t swallow those wild tales o 229 So please don't swallow those wild tales of 238 Four separate diarges of dynamite, he rea 238 Four separate charges of dynamite, he rea 250 "Good morning, ma 'am, * ' said Casey, clear 250 "Good morning, ma'am," said Casey, clearing 253 @@??The old woman dropped her hands to he 253 --??the old woman dropped her hands to he 262 "Pap says that you're a Federal officer! ' 262 "Pap says that you're a Federal officer!" 273 "All right, " Casey bore witness, keepin 273 "All right," Casey bore witness, keeping 274 "It's good hootch! " Joe declared impres 274 "It's GOOD hootch!" Joe declared impress 276 "Bet your life I can feel the kick! *?? h 276 "Bet your life I can feel the kick!" he a 279 "Ain 't writin ' no thin % ' ' Joe stated sol 279 "Ain't writin' nothin'," Joe stated solemnly. 280 "That's right, " nodded Casey and he added 280 "That's right," nodded Casey and he added, 284 "Dam right, that's right! I knew you wa 284 "Darn right, that's right! I knew you w 304 "How? *' questioned Paw, waggling hi 304 "How?" questioned Paw, waggling his 305 "Talk 'm t' death,'' Hank guessed with im 305 "Talk 'm t' death," Hank guessed with imb 306 "Think-I-can't? What 'U -- y 'bet 306 "Think-I-can't? What'll -- y'bet? 311 "Y ' watch 'im ! ' ' he barked, and the thre 311 "Y' watch 'im!" he barked, and the three tur 314 "'Z that a bumb! " Paw cackled nervousl 314 "'Z that a bumb?" Paw cackled nervously 315 "'Z goin' t' eat a bumb -- oP fool burro! 315 "'Z goin' t' eat a bumb -- ol' fool burro! 
317 "Whereupon they drank to Casey solemnly 317 Whereupon they drank to Casey solemnly, 322 "Better take a brace uh hootch, '* Joe sug 322 "Better take a brace uh hootch," Joe sugge 323 Paw accepted this remark as high praise 323 Paw accepted this remark, as high prais 330 "I am sick! " Casey snarled, and poure 330 "I AM sick!" Casey snarled, and poured 332 "'J you tell 'im you made me drink it? 332 "'J you tell 'im you MADE me drink it?" 334 "Gratitude, hell! A lot I got in Ufe t' b 334 "Gratitude, hell! A lot I got in life t' 338 Homeless, friendless ; but Joe was his friend 338 Homeless, friendless; but Joe was his friend, 347 ???? What'r yuh tryin* to pull on me now? '* h 347 "What 'r yuh tryin' to pull on me now?" he baw 348 Through his slits of swollen lids Casey glare 348 "Casey Ryan! I'm dogged if it ain't Casey!" e 349 ?? ?? Take them irons off 'n my friends I ' ' bel 349 "Take them irons off'n my friends!" bellowed Case 351 ????Ah-h -- I know yuh -- think I don'tf 351 "Ah-h -- I know yuh think I don't? I know 354 Casey eyed him Wearily, not in the least mol 354 Casey eyed him blearily, not in the least mo 356 ??*Damn a pipe,'* Casey grumbled wit 356 "Damn a pipe," Casey grumbled with d 359 "Brung a coroner, did ynh, lookin' for som 359 "Brung a cor'ner, did yuh, lookin' for som 361 * ' I blowed up a jackass yesterday when the 361 "I blowed up a jackass yesterday when they t 363 "Missed 'im! " he grumbled disgustedly t 363 "Missed 'im!" he grumbled disgustedly to 371 ??* You'll have to do something about m 371 "You'll have to do something about my m 372 "What about your mother? '' the sherif 372 "What about your mother?" the sheriff 373 Mart swallowed. **She has a cabin to her 373 Mart swallowed. "She has a cabin to hers 383 - an ' all I ??ot to say is, Barney Oakes i 383 " -- an' all I got to say is, Barney Oakes 392 ??* Casey Ryan, you need a shave. And you 392 "Casey Ryan, you need a shave. 
And your s 395 ??* The sheriJBf who raided Black Butte ad 395 "The sheriff who raided Black Butte admitt 396 * ' He '11 forget it when he feels the ruin to hi 396 "He'll forget it when he feels the ruin to his fa 397 Babe sent yoa a pincushion she made i 397 "Babe sent you a pincushion she made 410 "You damned, drunken boob I ' ?? shouted th 410 "You damned, drunken boob!" shouted the new 417 I couldn 't blame Casey much for the mood h 417 I couldn't blame Casey much for the mood he 419 At the comer of the Plaza where traffic i 419 At the corner of the Plaza where traffic 421 "Get in, old-timer, * ?? invited the driver who 421 "Get in, old-timer," invited the driver whom Ca 422 "Fords are mean cusses,** he observed sym 422 "Fords are mean cusses," he observed symp 423 "Are you Casey Ryan? "The driver too 423 "Are you Casey Ryan?" the driver too 424 "Bill Masters sure had ought t' know me,* 424 "Bill Masters sure had ought t' know me," 430 Casey nodded appreciatively. "Every dam 430 Casey nodded appreciatively. "Every dar 431 "Yeah -- I guess L. A. 's a jinx for you al 431 "Yeah -- I guess L. A.'s a jinx for you all 435 "I got to telephone my wife I '* Casey ex 435 "I GOT to telephone my wife!" Casey excla 436 "Aw, you can 'phone from Fontana. I '1 436 "Aw, you can 'phone from Fontana. I'll 441 "Casey Ryan, tell me the truth. If you??r 441 "Casey Ryan, tell me the truth. If you're 442 "Sure as I??m standin' herel What make 442 "Sure as I'm standin' here! What makes 445 "Wanta drive ? ?? ' Casey ??s friend was rollin 445 "Wanta drive?" Casey's friend was rolling a smo 446 "Well, you can ask anybody if Casey Ryan ' 446 "Well, you can ask anybody if Casey Ryan's 447 "I believe it, Casey. Darned if I don??t 447 "I believe it, Casey. Darned if I don't. 448 "Say, if I drive till I ??m tiredy ?? ?? he retorted 448 "Say, if I drive till I'm TIRED," he retorted, "I'm 455 "You sure are some driver, '?? 
his new frien 455 "You sure are some driver," his new friend p 456 Casey therefore * * let 'er out ' ', and the For 456 Casey therefore "let 'er out", and the Ford went 458 "Too bad youVe made your pile already,?? 458 "Too bad you've made your pile already," 464 "When I git to thinkin ' about hittin ' out int 464 "When I git to thinkin' about hittin' out into 471 He paused ; and when he spoke again his ton 471 He paused; and when he, spoke again his ton 473 "?? Good idea," said Casey shortly, his ow 473 "Good idea," said Casey shortly, his own t 478 "They been tryin ' to make Casey Ryan ove 478 "They been tryin' to make Casey Ryan over 491 ???? I kinda thought it was you, Kenner,'* h 491 "I kinda thought it was you, Kenner," he dra 493 ??* What if I ain't got any! '* the young ma 493 "What if I ain't got any?" the young man par 496 ?? ?? I can 't argue with the law, ' ' he said, as h 496 "I can't argue with the law," he said, as he began t 497 The big man chuckled again. '* The law' 497 The big man chuckled again. "The law's 499 ?? * Slip me five hundred, anyway. How muc 499 "Slip me five hundred, anyway. How much is 500 * ?? Sixty gallons -- bottled, most of it. Tw 500 "Sixty gallons -- bottled, most of it. Two ke 501 * ' Pile out thirty gallons of the bottled good 501 "Pile out thirty gallons of the bottled goods b 505 "All right, pile in your blankets, '?? the bi 505 "All right, pile in your blankets," the big m 507 ?? * Aw, can 't yuh find some way to leave m 507 "Aw, can't yuh find some way to leave me jac 510 *?? Couldn't possibly. I have to have some 510 "Couldn't possibly. I have to have somethi 512 "Better keep right on going, boys. I 'd hat 512 "Better keep right on going, boys. I'd hate 516 "So that 's the kind uh game yuh asked me t 516 "So that's the kind uh game yuh asked me to 521 "Now, ynh take me, fer instance, I pla 521 "Now, yuh take me, fer instance. 
I pla 522 ??* Take this highjackin' to-night, for instance 522 "Take this highjackin' to-night, for instance. L 524 *?? Now there's a card you can slip up you 524 "Now there's a card you can slip up your s 525 "?? You noticed I got my gas-tank behind -- 525 "You noticed I got my gas-tank behind -- a t 526 The muscles along Casey's jaw had hardene 526 The muscles, along Casey's jaw had harden 527 * ?? Who says I 'm in ? Yuh ain 't heard Case 527 "Who says I'm in? Yuh ain't heard Casey Ryan 530 "You was drivin?? this car yourself whe 530 "You was drivin' this car yourself when 535 "Burros ain 't any extincter than what you ' 535 "Burros ain't any extincter than what you'll 536 Kenner laughed. ???? An' what would I b 536 Kenner laughed. "An' what would I be do 537 Casey drove as **purty " as was possibl 537 Casey drove as "purty" as was possible 544 "I've been thinking over your case/* Ken 544 "I've been thinking over your case," Ken 546 "I *ve changed my mind about havin * you fo 546 "I've changed my mind about havin' you for 547 "Now, the way I *ve doped this out, I *m goin 547 "Now, the way I've doped this out, I'm goin' t 549 "Why waitf Hand over the roll, and tha 549 "Why wait? Hand over the roll, and tha 552 Casey, still balef ully silent, emptied first on 552 Casey, still balefully silent, emptied first one 554 "Like hell I consummated the deal I ' ' Case 554 "Like hell I consummated the deal!" Casey wa 556 "Don 't take any bad money -- an ?? don 't le 556 "Don't take any bad money -- an' don't let 'e 560 The highway north from the Santa Fe Bail 560 The highway north from the Santa Fe Rail 562 "JUNIPER WELLS S 562 "JUNIPER WELLS 3 565 When a man has driven a Ford JBf teen hour 565 When a man has driven a Ford fifteen hours 569 "That sounds pretty businesslike, old man, * 569 "That sounds pretty businesslike, old man," a 571 "Where the hell did you come fromf '* h 571 "Where the hell did YOU come from?" he 572 "Does it matter ? 
I 'm here, ' ' the other par 572 "Does it matter? I'm here," the other parried 575 "All right -- if you 're willin ' to rustle th 575 "All right -- if you're willin' to rustle the 585 DxTBiNo the companionable smoke that fol 585 During the companionable smoke that foll 588 "* I Ve been out now for about three weeks 588 "I've been out now for about three weeks; a 592 ??* Go ahead an' take a nap if yuh want to, 592 "Go ahead an' take a nap if yuh want to," h 595 "The comer was never yet so tight tha 595 "The corner was never yet so tight th 601 ""Where's the piece you found?" he ver 601 "Where's the piece you found?" he very 604 "Now if that there lump uh high-grade ain ' 604 "Now if that there lump uh high-grade ain't 607 * ' The breakfast was fine, ' ' he replied easily 607 "The breakfast was fine," he replied easily. "A c 612 "Aw, hell ! ' * he muttered disgustedly, an 612 "Aw, hell!" he muttered disgustedly, and we 614 "By Jove, that was a fine sleep I had,'* h 614 "By Jove, that was a fine sleep I had," he 615 "Naw/* Casey's grunt was eloquent o 615 "Naw." Casey's grunt was eloquent o 616 "Get the car fixed all right! '* Mack Nolan' 616 "Get the car fixed all right?" Mack Nolan's 617 "Naw.'' Then Casey added grimly, "I' 617 "Naw." Then Casey added grimly, "I'm 621 "Aw, let the darned thing alone till we eat, ' 621 "Aw, let the darned thing alone till we eat," h 624 * ' Well, mebby I 'm kind of a crank about m 624 "Well, mebby I'm kind of a crank about my ca 625 "At the same time, ' ' he went on with risin 625 "At the same time," he went on with rising c 630 ""What sort of looking fellows were those 630 "What sort of looking fellows were those, 633 "It might help us both considerably,*' h 633 "It might help us both considerably," he 634 Casey puffed hard on his pipe. * * The world ' 634 Casey puffed hard on his pipe. "The world's gi 647 "I '11 do a little more guessing, now : I gues 647 "I'll do a little more guessing, now: I guess 648 "You go taheU ! 
' ' growled Casey, swallow 648 "You go tahell!" growled Casey, swallowing 651 Casey grunted. "Chump is right, mebby 651 Casey grunted. "'Chump' is right, meb 661 "So it's White Mule you??re trailin\'' H 661 "So it's White Mule you're trailin'." He 664 "This Smiling Lou; you??d know him again 664 "This Smiling Lou; you'd know him again, 667 "Y\ih -- whatf'' In the firelight Casey' 667 "Yuh -- WHAT?" In the firelight Casey's 670 "There ??s something else that feller told m 670 "There's something else that feller told me 674 "Well, ' ' Casey said grimly, * * I dunno ho 674 "Well," Casey said grimly, "I dunno how scar 675 Again Mack Nolan laughed. "Catching ' 675 Again Mack Nolan laughed. "Catching's 679 "I think we *d better be moving from here be 679 "I think we'd better be moving from here bef 689 "And now, ' * said Nolan briskly, when he ha 689 "And now," said Nolan briskly, when he had h 692 "We??U have to rebottle all the whisky,'* sai 692 "We'll have to rebottle all the whisky," said 700 "There ain't a damn' bottle here! '' he bel 700 "There ain't a damn' bottle here!" he bello 706 "Water I ' * He snorted disgustedly. * * Case 706 "Water!" He snorted disgustedly. "Casey Ryan 708 "Which I wisht it wasn 't ! " snarled Casey 708 "Which I wisht it wasn't!" snarled Casey. " 710 "Aw, tahell with your White Mule I Tahel 710 "Aw, tahell with your White Mule! Tahell 721 "The thing's deeper than it looked yester 721 "The thing's deeper than it looked, yeste 723 "I??m absolutely certain, Casey, that if yo 723 "I'm absolutely certain, Casey, that if you 724 Casey sat up. * * Well, they coulda played m 724 Casey sat up. "Well, they coulda played me f 725 "All the more reason, ' ' said Nolan, als 725 "All the more reason," said Nolan, also s 729 "Nothing for it, Casey, -- we '11 have to lo 729 "Nothing for it, Casey, -- we'll have to lo 734 "You could go out and highjack some one, ' 734 "You could go out and highjack some one." N 735 "Casey studied the matter. 
"Bill Master 735 Casey studied the matter. "Bill Masters 738 Nolan ??s crisp tone of authority remained wit 738 Nolan's crisp tone of authority remained with 740 Mace Nolan had just crawled into his bun 740 Mack Nolan had just crawled into his bun 748 "Tis a good thing yuh left this other ca 748 "'Tis a good thing yuh left this other c 753 Casey raised to one elbow. * * When yuh tol 753 Casey raised to one elbow. "When yuh told C 754 ?? ' Good I I thought I hadn 't made a mistak 754 "Good! I thought I hadn't made a mistake in m 761 Casey fell asleep inmiediately afterward, bu 761 Casey fell asleep immediately afterward, but 763 "Well, the sheriff didn't arrive last night,' 763 "Well, the sheriff didn't arrive last night," 764 "It was a good job! "Casey maintained 764 "It was a GOOD job!" Casey maintained 770 Nolan laughed his easy little chuckle. * * Why 770 Nolan laughed his easy little chuckle. "Why, n 780 "Help me unload this stuff, Ryan,'* he said 780 "Help me unload this stuff, Ryan," he said, 782 "An' how many did you lick, Mr. Nolan? 782 "An' how many did YOU lick, Mr. Nolan?" 787 "When it 's as easy done as that, Mr. Nolan 787 "When it's as easy done as that, Mr. Nolan, 805 "I dunno -- nothin 's been picked up since 805 "I dunno -- nothin's been picked up since I 812 "What luck, Ryan ? I beat you back b 812 "What luck, Ryan? I beat you back by 813 "Nawl '* Casey spat disgustedly. "Neve 813 "Naw!" Casey spat disgustedly. "Never 816 You can 't wonder if relations were somewha 816 You can't wonder if relations were somewhat 818 Natubb had made Casey Ryan an optimist 818 Nature had made Casey Ryan an optimist 823 "They's times,'' said Casey, hopefully low 823 "They's times," said Casey, hopefully lowe 825 "* Yeah 1 ' ' Casey cocked a knowing eye at th 825 "Yeah?" Casey cocked a knowing eye at the spea 828 "Arizona, I see.'' The man nodded towar 828 "Arizona, I see." the man nodded toward 829 "TJh-huh. ' ' Casey glanced that way 829 "Uh-huh." 
Casey glanced that way. "K 831 ??' Some. Do y 831 "Some. Do you? 833 "Friend uh yours ? ' ' The fellow turned hi 833 "Friend uh yours?" the fellow turned his he 838 "Yeah! '' The self-styled Jim Cassid 838 "Yeah?" the self-styled Jim Cassidy 841 "You pass, ' ' he stated, with a relieved sigh 841 "You pass," he stated, with a relieved sigh. " 842 "You know 'im, all right.'' Casey als 842 "You know 'im, all right." Casey also 849 * ' Hullo ! Where 's your pardner ? * * he de 849 "Hullo! Where's your pardner?" he demanded th 850 "I 'm in pardnerships with myself this trip, ' 850 "I'm in pardnerships with myself this trip," Ca 851 "Where did you get that car 851 "Where did you get that car? 853 "Got any booze in that car! '* Smiling Lo 853 "Got any booze in that car?" Smiling Lou 855 * ' I wisht you wouldn *t look, * ' he said glumly 855 "I wisht you wouldn't look," he said glumly. "I go 860 "The boards is turned over on all the rest/ 860 "The boards is turned over on all the rest, 861 "What all have you got? '* Smiling Lou low 861 "What all have you got?" Smiling Lou lower 862 "Well, get it into my car, and make i 862 "Well, get it into my car, and make 866 "All right -- I to the goat/' he surrendere 866 "All right -- I'm the goat," he surrendered 868 After that. Smiling Lou started his moto 868 After that, Smiling Lou started his moto 870 "How much did he git off 'n youf '' he aske 870 "How much did he get off'n YOU?" he asked i 871 "Clean as a last year's bone in a kioty den/ 871 "Clean as a last year's bone in a kioty den, 872 "He wouldn't -- not mth you workin' o 872 "He wouldn't -- not with you workin' 874 "Oh, Lou's cute, all right They don't an 874 "Oh, Lou's cute, all right. They don't a 875 "Second trip, ' ' Casey informed him with a 875 "Second trip," Casey informed him with an a 877 "That'll suit me fine,'' Casey declared. An 877 "That'll suit me fine," Casey declared. And 888 "Where 'd you get this car? '' he demanded 888 "Where'd you get this car?" 
he demanded, i 889 "Bought it, ' ' Casey told him gruf 889 "Bought it," Casey told him gruffly 891 "Over at Goffs, just this side of Needles/ 891 "Over at Goffs, just this side of Needles. 892 "Got a bUl of sale? ' 892 "Got a bill of sale?" 893 "You got Casey Ryan 's word f er it, * * Case 893 "You got Casey Ryan's word fer it," Casey ret 894 "Are you Casey Ryan? '* The speed cop' 894 "Are you Casey Ryan?" the speed cop's 895 "Anybody says I ain *t, you send 'em to m 895 "Anybody says I ain't, you send 'em to me 899 "Heyl Don't I git paid fer my gas?" th 899 "Hey! Don't I git paid fer my gas?" th 900 "* Aw, go tahell I ' ' Casey grunted, and thre 900 "Aw, go tahell!" Casey grunted, and threw a wa 902 "Thro win' money around like a hootch-run 902 "Throwin' money around like a hootch-runn 903 Casey "got going.*' Twice on the way i 903 Casey "got going." Twice on the way in 905 Casey was booked -- along with * * To 905 Casey was booked -- along with "Tom S 907 He waited for an hour or two, Ustening wit 907 He waited for an hour or two, listening wi 913 Jim Cassidy still dung desperately to hi 913 Jim Cassidy still clung desperately to h 914 His chief desire now was to get oat of ther 914 His chief desire now was to get out of ther 919 At that it was a fooPs errand. Casey wa 919 At that it was a fool's errand. Casey w 920 "It's no use asking questions. Jack,*' th 920 "It's no use asking questions, Jack," the 936 Five miles east of Amboy, when a red sraise 936 Five miles east of Amboy, when a red sunset 941 "Why, hello, Ryan! " Madi Nolan greeted 941 "Why, hello, Ryan!" Mack Nolan greeted, 942 "Naw. This ain't no trouble," he grunted 942 "Naw. This ain't no trouble," he granted 945 "I'm pretty good at guessing,** he smiled 945 "I'm pretty good at guessing," he smiled. 950 "Now, of course, I*m talking like an ol 950 "Now, of course, I'm talking like an ol
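
The two "big stick" steps described in the message above (the blind global change of line-start asterisks into an opening double-quote, and the comparison of paragraph first lines that ignores ordinary o.c.r. noise and only reports wild mismatches) could be sketched roughly as follows. This is only an illustration in Python, not bowerbird's actual tool: the function names, the 40-character comparison width, the 0.6 similarity threshold, the optional space after the asterisks, and the one-paragraph lookahead used to get back in step after a split or merge are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def fix_leading_quotes(text):
    """Blind global change: '**' at the start of a line (optionally followed
    by a space) is assumed to be a misread opening double-quote."""
    return re.sub(r'^\*\* ?', '"', text, flags=re.MULTILINE)


def first_lines(text):
    """Return the first line of every paragraph (paragraphs are assumed to be
    separated by blank lines)."""
    paragraphs = [p for p in re.split(r'\n\s*\n', text.strip()) if p.strip()]
    return [p.strip().splitlines()[0].strip() for p in paragraphs]


def similarity(a, b, width=40):
    """Rough 0..1 similarity of two first lines over their leading characters."""
    return SequenceMatcher(None, a[:width], b[:width]).ratio()


def find_paragraph_mismatches(ocr_lines, pg_lines, threshold=0.6):
    """Walk both lists of first lines in step.

    Pairs that are merely noisy (similarity >= threshold) are skipped.
    A wild mismatch is reported as (kind, ocr_line, pg_line); a one-line
    lookahead (an assumption of this sketch) resynchronizes the walk when
    one version has a paragraph start the other lacks.
    """
    problems = []
    i = j = 0
    while i < len(ocr_lines) and j < len(pg_lines):
        if similarity(ocr_lines[i], pg_lines[j]) >= threshold:
            i += 1
            j += 1
        elif (i + 1 < len(ocr_lines)
              and similarity(ocr_lines[i + 1], pg_lines[j]) >= threshold):
            # Extra paragraph start in the o.c.r.: split there or merged in p.g.
            problems.append(("only in ocr", ocr_lines[i], None))
            i += 1
        elif (j + 1 < len(pg_lines)
              and similarity(ocr_lines[i], pg_lines[j + 1]) >= threshold):
            # Extra paragraph start in p.g.: split there or merged in the o.c.r.
            problems.append(("only in pg", None, pg_lines[j]))
            j += 1
        else:
            problems.append(("differs", ocr_lines[i], pg_lines[j]))
            i += 1
            j += 1
    return problems
```

Each reported line still has to be checked against the page scan, since the comparison alone cannot say which version has the paragraphing right. Word-level differences inside matching lines (such as "grunted" versus "granted") are deliberately not reported by this step.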
From schultzk at uni-trier.de Fri Nov 21 13:02:23 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri, 21 Nov 2008 22:02:23 +0100
Subject: [gutvol-d] trail of the white mule -- 001
In-Reply-To:
References:
Message-ID:

Well, do as you wish. I know of the caveats of the methods. Yours is a very crude big stick, but do not forget that Goliath was felled by a small and refined tool.

regards and good luck
Keith.

On 21.11.2008 at 18:48, Bowerbird at aol.com wrote:

> keith said:
> > another way to improve this method and avoid
> > unnecessary checks of words that are not in the dictionary
> > is to use markov chaining analysis on the spelling.
> >
> > You could use a concordance to do different kinds
> > of analysis or help with the above. This will help with
> > idiosyncrasies of an author.
> >
> > Similarly, markov chains used on the word level can be used.
> >
> > Adding, a morphological component could be used.
>
> that's pretty good for an off-the-top response, keith.
>
> at the same time, it's out-of-touch with the _reality_
> of the relatively crudish nature of the task-at-hand.
>
> we don't need those sophisticated types of analysis;
> nope, we need something that's a _lot_ more simple.
>
> for instance, let's consider the "paragraphing" data
> that i just collected from the "white mule" reposting.
>
> what i do with that is to compare the first lines from
> the paragraphs in both versions, and i see if/where
> there are discrepancies.
>
> for this task, however, i'm looking for a wild mismatch,
> where one line is _completely_ different from the other.
> since we're looking for cases where a paragraph was
> inadvertently split, or incorrectly merged with another.
> > so mismatches that result from simple o.c.r. differences > are to be ignored in this task. > > i've appended the data from this comparison for you, > because it illustrates the arena in which we're working. > > with one exception (noted later), these lines "match up". > so _these_cases_ are what o.c.r. errors look like today... > these are the types of corrections that we have to make. > > if you look at those errors, you'll see they're very crude. > > i sincerely doubt that markov chaining will help much, > or any "sophisticated" technique. for these changes, > you need to employ a big stick, not a refined analysis... > > oh, and i already employed one such "big stick" here. > there were lots and lots of cases with a double-quote > at the line-start that were misrecognized as asterisks. > > indeed, there were 257 such misreads as "** ". _257_. > easy to say "do a blind global change on those babies", > and be quite confident there won't be any bad changes. > > i am sure if you peruse the appended lines, you will find > many more global changes that would also be beneficial. > > more importantly, you will grasp the _reality_ of our task. > > *** > > also, i now see one more place where there might be a > paragraphing problem in the project gutenberg repost, > namely paragraph 348, which it appears is a mismatch. > > yep, the p.g. repost has a paragraphing error here: > > Through his slits of swollen lids Casey glared > > http://z-m-l.com/go/wmule/wmulep110.jpg > > http://www.gutenberg.org/files/2063/2063.txt > > so that's _2_ paragraphing errors on this reposting... > > *** > > by the way, whenever you look at differences like this, > you're bound to notice the occasional error by chance. > > such was the case here. > > if you look at paragraph 942, you'll see this: > > 942 "Naw. This ain't no trouble," he grunted > > 942 "Naw. This ain't no trouble," he granted > > the top line -- from the o.c.r. 
-- looks right (grunted), > while the bottom line -- from the p.g. reposting -- > looks questionable (granted). sure enough, the scan > says "grunted". so it looks like we need to start a list... > > -bowerbird > > 4 "This'U mean thirty days for you," splut > 4 "This'll mean thirty days for you," splu > > 5 "Get the undertaker on the line first 1 > 5 "Get the undertaker on the line first!" > > 9 ""Why do you persist in making trouble fo > 9 "Why do you persist in making trouble for > > 11 "All right, ma'am. You can drive, then.* > 11 "All right, ma'am. You can drive, then." > > 13 "You was in a hurry to git home, ' * Case > 13 "You was in a hurry to git home," Casey p > > 15 "Police be danmed! I'm tryin* to please > 15 "Police be damned! I'm tryin' to please > > 16 "Heyl "Whadda yuh mean, blockin' th > 16 "Hey! Whadda yuh mean, blockin' the > > 20 "It's coining to a show-down, Jack,'* sh > 20 "It's coming to a show-down, Jack," she > > 22 "It sounds much better, putting it that way, ' > 22 "It sounds much better, putting it that way," m > > 23 "It isn't spoiled,'* I grinned. "Case > 23 "It isn't spoiled," I grinned. "Casey > > 29 "Well-sir, ' * she drawled, making one wor > 29 "Well-sir," she drawled, making one word o > > 32 "He d??d, yes. But his idea of luxury is sit > 32 "He did, yes. But his idea of luxury is sitt > > 34 "And you know, * ' she went on, shuffling th > 34 "And you know," she went on, shuffling the c > > 36 "He did one good stroke of business,'* I ven > 36 "He did one good stroke of business," I vent > > 48 "Well-sir, he 's gone I * * she announced, an > 48 "Well-sir, he's gone!" 
she announced, and sto > > 57 He was fortunate in buying a demonstrator * > 57 He was fortunate in buying a demonstrator's > > 61 He was teUing himself that he didn't car > 61 He was telling himself that he didn't ca > > 62 He climbed stiflSy out, squinted at the sk > 62 He climbed stiffly out, squinted at the sk > > 63 "If they wanta come pinch me here, 1*11 mee > 63 "If they wanta come pinch me here, I'll mee > > 65 "Hell! " said Casey, breathing deep when > 65 "Hell!" said Casey, breathing deep when, > > 84 "Whereas that there Joshuay tree pointin? > 84 "Where's that there Joshuay tree pointin' > > 96 "Somebody ??s gunnin* fer us, looks like t? > 96 "Somebody's gunnin' fer us, looks like t' m > > 98 "So'd I,'' boasted Barney, "but that ain?? > 98 "So'd I," boasted Barney, "but that ain't > > 99 "What I'm figurin' out now,'' said Casey > 99 "What I'm figurin' out now," said Casey, > > 103 "What yuh figurin' on doin'T " Case > 103 "What yuh figurin' on doin'?" Casey > > 110 Bamey, therefore, dug like a badger with > 110 Barney, therefore, dug like a badger with > > 112 Standing there pufling and wondering what t > 112 Standing there puffing and wondering what t > > 126 So here was what the boulder concealed, - > 126 So here was what the boulder concealed, > > 134 Away from the cabin a pebble *s throw, h > 134 Away from the cabin a pebble's throw, he > > 145 Against the limitations prescribed by his ma > 145 Against the limitations proscribed by his ma > > 157 "Git up off 'n him! " a new voice commande > 157 "Git up off'n him!" a new voice commanded > > 160 "Paw, you oP fool, you, get your finge > 160 "Paw, you ol' fool, you, get your fing > > 162 "* Aw, you shut up, Paw. You ain 't gittin > 162 "Aw, you shut up, Paw. You ain't gittin' no > > 166 "I'd fix 'im here an' now," threatened Paw > 166 "I'd fix 'im, here an', now," threatened P > > 171 "Ever drill in rock? '?? he asked shor > 171 "Ever drill in rock?" 
he asked shortly > > 172 "Mebbe I have an' mebbe I ain't,'' Case > 172 "Mebbe I have an' mebbe I ain't," Casey > > 173 "Here's a drill, an' here's your single- jack > 173 "Here's a drill, an' here's your single-jack. > > 177 "Two of us waitin * to see your boss, huh ? ' > 177 "Two of us waitin' to see your boss, huh?" Cas > > 182 "Y 'ain't told us yet what brung yuh up o > 182 "Y'ain't told us yet what brung yuh up on > > 185 "More likely * White Mule '. ' ' Casey cocke > 185 "More likely 'White Mule.'" Casey cocked a k > > 188 Casey grinned secretively. * * A man can 't b > 188 Casey grinned secretively. "A man can't be pi > > 189 "He '11 kinda like to meet you, ' ' Joe returne > 189 "He'll kinda like to meet you," Joe returned da > > 191 "Guess I got youm/' Hank leered, **when > 191 "Guess I got yourn," Hank leered "when s > > 192 "If any one 's 'been usin ' a high-power i > 192 "If any one's 'been usin' a high-power it > > 197 "We got to hold ye, ' ' Paw spoke up unc > 197 "We got to hold ye," Paw spoke up unctio > > 198 "Aw, shut up. Paw, you oP fool, you," Han > 198 "Aw, shut up, Paw, you ol' fool, you," Ha > > 199 "Well now, when it comes to fciWin'," Case > 199 "Well now, when it comes to KILLIN'," Case > > 205 "Other feller hurt bad! " he inquired care > 205 "Other feller hurt bad?" he inquired carel > > 229 So please don??t swallow those wild tales o > 229 So please don't swallow those wild tales of > > 238 Four separate diarges of dynamite, he rea > 238 Four separate charges of dynamite, he rea > > 250 "Good morning, ma 'am, * ' said Casey, clear > 250 "Good morning, ma'am," said Casey, clearing > > 253 @@??The old woman dropped her hands to he > 253 --??the old woman dropped her hands to he > > 262 "Pap says that you're a Federal officer! ' > 262 "Pap says that you're a Federal officer!" > > 273 "All right, " Casey bore witness, keepin > 273 "All right," Casey bore witness, keeping > > 274 "It's good hootch! 
" Joe declared impres > 274 "It's GOOD hootch!" Joe declared impress > > 276 "Bet your life I can feel the kick! *?? h > 276 "Bet your life I can feel the kick!" he a > > 279 "Ain 't writin ' no thin % ' ' Joe stated sol > 279 "Ain't writin' nothin'," Joe stated solemnly. > > 280 "That's right, " nodded Casey and he added > 280 "That's right," nodded Casey and he added, > > 284 "Dam right, that's right! I knew you wa > 284 "Darn right, that's right! I knew you w > > 304 "How? *' questioned Paw, waggling hi > 304 "How?" questioned Paw, waggling his > > 305 "Talk 'm t' death,'' Hank guessed with im > 305 "Talk 'm t' death," Hank guessed with imb > > 306 "Think-I-can't? What 'U -- y 'bet > 306 "Think-I-can't? What'll -- y'bet? > > 311 "Y ' watch 'im ! ' ' he barked, and the thre > 311 "Y' watch 'im!" he barked, and the three tur > > 314 "'Z that a bumb! " Paw cackled nervousl > 314 "'Z that a bumb?" Paw cackled nervously > > 315 "'Z goin' t' eat a bumb -- oP fool burro! > 315 "'Z goin' t' eat a bumb -- ol' fool burro! > > 317 "Whereupon they drank to Casey solemnly > 317 Whereupon they drank to Casey solemnly, > > 322 "Better take a brace uh hootch, '* Joe sug > 322 "Better take a brace uh hootch," Joe sugge > > 323 Paw accepted this remark as high praise > 323 Paw accepted this remark, as high prais > > 330 "I am sick! " Casey snarled, and poure > 330 "I AM sick!" Casey snarled, and poured > > 332 "'J you tell 'im you made me drink it? > 332 "'J you tell 'im you MADE me drink it?" > > 334 "Gratitude, hell! A lot I got in Ufe t' b > 334 "Gratitude, hell! A lot I got in life t' > > 338 Homeless, friendless ; but Joe was his friend > 338 Homeless, friendless; but Joe was his friend, > > 347 ???? What'r yuh tryin* to pull on me now? '* h > 347 "What 'r yuh tryin' to pull on me now?" he baw > > 348 Through his slits of swollen lids Casey glare > 348 "Casey Ryan! I'm dogged if it ain't Casey!" e > > 349 ?? ?? 
Take them irons off 'n my friends I ' ' bel > 349 "Take them irons off'n my friends!" bellowed Case > > 351 ????Ah-h -- I know yuh -- think I don'tf > 351 "Ah-h -- I know yuh think I don't? I know > > 354 Casey eyed him Wearily, not in the least mol > 354 Casey eyed him blearily, not in the least mo > > 356 ??*Damn a pipe,'* Casey grumbled wit > 356 "Damn a pipe," Casey grumbled with d > > 359 "Brung a coroner, did ynh, lookin' for som > 359 "Brung a cor'ner, did yuh, lookin' for som > > 361 * ' I blowed up a jackass yesterday when the > 361 "I blowed up a jackass yesterday when they t > > 363 "Missed 'im! " he grumbled disgustedly t > 363 "Missed 'im!" he grumbled disgustedly to > > 371 ??* You'll have to do something about m > 371 "You'll have to do something about my m > > 372 "What about your mother? '' the sherif > 372 "What about your mother?" the sheriff > > 373 Mart swallowed. **She has a cabin to her > 373 Mart swallowed. "She has a cabin to hers > > 383 - an ' all I ??ot to say is, Barney Oakes i > 383 " -- an' all I got to say is, Barney Oakes > > 392 ??* Casey Ryan, you need a shave. And you > 392 "Casey Ryan, you need a shave. And your s > > 395 ??* The sheriJBf who raided Black Butte ad > 395 "The sheriff who raided Black Butte admitt > > 396 * ' He '11 forget it when he feels the ruin to hi > 396 "He'll forget it when he feels the ruin to his fa > > 397 Babe sent yoa a pincushion she made i > 397 "Babe sent you a pincushion she made > > 410 "You damned, drunken boob I ' ?? shouted th > 410 "You damned, drunken boob!" shouted the new > > 417 I couldn 't blame Casey much for the mood h > 417 I couldn't blame Casey much for the mood he > > 419 At the comer of the Plaza where traffic i > 419 At the corner of the Plaza where traffic > > 421 "Get in, old-timer, * ?? 
invited the driver who > 421 "Get in, old-timer," invited the driver whom Ca > > 422 "Fords are mean cusses,** he observed sym > 422 "Fords are mean cusses," he observed symp > > 423 "Are you Casey Ryan? "The driver too > 423 "Are you Casey Ryan?" the driver too > > 424 "Bill Masters sure had ought t' know me,* > 424 "Bill Masters sure had ought t' know me," > > 430 Casey nodded appreciatively. "Every dam > 430 Casey nodded appreciatively. "Every dar > > 431 "Yeah -- I guess L. A. 's a jinx for you al > 431 "Yeah -- I guess L. A.'s a jinx for you all > > 435 "I got to telephone my wife I '* Casey ex > 435 "I GOT to telephone my wife!" Casey excla > > 436 "Aw, you can 'phone from Fontana. I '1 > 436 "Aw, you can 'phone from Fontana. I'll > > 441 "Casey Ryan, tell me the truth. If you??r > 441 "Casey Ryan, tell me the truth. If you're > > 442 "Sure as I??m standin' herel What make > 442 "Sure as I'm standin' here! What makes > > 445 "Wanta drive ? ?? ' Casey ??s friend was rollin > 445 "Wanta drive?" Casey's friend was rolling a smo > > 446 "Well, you can ask anybody if Casey Ryan ' > 446 "Well, you can ask anybody if Casey Ryan's > > 447 "I believe it, Casey. Darned if I don??t > 447 "I believe it, Casey. Darned if I don't. > > 448 "Say, if I drive till I ??m tiredy ?? ?? he retorted > 448 "Say, if I drive till I'm TIRED," he retorted, "I'm > > 455 "You sure are some driver, '?? his new frien > 455 "You sure are some driver," his new friend p > > 456 Casey therefore * * let 'er out ' ', and the For > 456 Casey therefore "let 'er out", and the Ford went > > 458 "Too bad youVe made your pile already,?? > 458 "Too bad you've made your pile already," > > 464 "When I git to thinkin ' about hittin ' out int > 464 "When I git to thinkin' about hittin' out into > > 471 He paused ; and when he spoke again his ton > 471 He paused; and when he, spoke again his ton > > 473 "?? 
Good idea," said Casey shortly, his ow > 473 "Good idea," said Casey shortly, his own t > > 478 "They been tryin ' to make Casey Ryan ove > 478 "They been tryin' to make Casey Ryan over > > 491 ???? I kinda thought it was you, Kenner,'* h > 491 "I kinda thought it was you, Kenner," he dra > > 493 ??* What if I ain't got any! '* the young ma > 493 "What if I ain't got any?" the young man par > > 496 ?? ?? I can 't argue with the law, ' ' he said, as h > 496 "I can't argue with the law," he said, as he began t > > 497 The big man chuckled again. '* The law' > 497 The big man chuckled again. "The law's > > 499 ?? * Slip me five hundred, anyway. How muc > 499 "Slip me five hundred, anyway. How much is > > 500 * ?? Sixty gallons -- bottled, most of it. Tw > 500 "Sixty gallons -- bottled, most of it. Two ke > > 501 * ' Pile out thirty gallons of the bottled good > 501 "Pile out thirty gallons of the bottled goods b > > 505 "All right, pile in your blankets, '?? the bi > 505 "All right, pile in your blankets," the big m > > 507 ?? * Aw, can 't yuh find some way to leave m > 507 "Aw, can't yuh find some way to leave me jac > > 510 *?? Couldn't possibly. I have to have some > 510 "Couldn't possibly. I have to have somethi > > 512 "Better keep right on going, boys. I 'd hat > 512 "Better keep right on going, boys. I'd hate > > 516 "So that 's the kind uh game yuh asked me t > 516 "So that's the kind uh game yuh asked me to > > 521 "Now, ynh take me, fer instance, I pla > 521 "Now, yuh take me, fer instance. I pla > > 522 ??* Take this highjackin' to-night, for instance > 522 "Take this highjackin' to-night, for instance. L > > 524 *?? Now there's a card you can slip up you > 524 "Now there's a card you can slip up your s > > 525 "?? You noticed I got my gas-tank behind -- > 525 "You noticed I got my gas-tank behind -- a t > > 526 The muscles along Casey's jaw had hardene > 526 The muscles, along Casey's jaw had harden > > 527 * ?? Who says I 'm in ? 
Yuh ain 't heard Case > 527 "Who says I'm in? Yuh ain't heard Casey Ryan > > 530 "You was drivin?? this car yourself whe > 530 "You was drivin' this car yourself when > > 535 "Burros ain 't any extincter than what you ' > 535 "Burros ain't any extincter than what you'll > > 536 Kenner laughed. ???? An' what would I b > 536 Kenner laughed. "An' what would I be do > > 537 Casey drove as **purty " as was possibl > 537 Casey drove as "purty" as was possible > > 544 "I've been thinking over your case/* Ken > 544 "I've been thinking over your case," Ken > > 546 "I *ve changed my mind about havin * you fo > 546 "I've changed my mind about havin' you for > > 547 "Now, the way I *ve doped this out, I *m goin > 547 "Now, the way I've doped this out, I'm goin' t > > 549 "Why waitf Hand over the roll, and tha > 549 "Why wait? Hand over the roll, and tha > > 552 Casey, still balef ully silent, emptied first on > 552 Casey, still balefully silent, emptied first one > > 554 "Like hell I consummated the deal I ' ' Case > 554 "Like hell I consummated the deal!" Casey wa > > 556 "Don 't take any bad money -- an ?? don 't le > 556 "Don't take any bad money -- an' don't let 'e > > 560 The highway north from the Santa Fe Bail > 560 The highway north from the Santa Fe Rail > > 562 "JUNIPER WELLS S > 562 "JUNIPER WELLS 3 > > 565 When a man has driven a Ford JBf teen hour > 565 When a man has driven a Ford fifteen hours > > 569 "That sounds pretty businesslike, old man, * > 569 "That sounds pretty businesslike, old man," a > > 571 "Where the hell did you come fromf '* h > 571 "Where the hell did YOU come from?" he > > 572 "Does it matter ? I 'm here, ' ' the other par > 572 "Does it matter? 
I'm here," the other parried > > 575 "All right -- if you 're willin ' to rustle th > 575 "All right -- if you're willin' to rustle the > > 585 DxTBiNo the companionable smoke that fol > 585 During the companionable smoke that foll > > 588 "* I Ve been out now for about three weeks > 588 "I've been out now for about three weeks; a > > 592 ??* Go ahead an' take a nap if yuh want to, > 592 "Go ahead an' take a nap if yuh want to," h > > 595 "The comer was never yet so tight tha > 595 "The corner was never yet so tight th > > 601 ""Where's the piece you found?" he ver > 601 "Where's the piece you found?" he very > > 604 "Now if that there lump uh high-grade ain ' > 604 "Now if that there lump uh high-grade ain't > > 607 * ' The breakfast was fine, ' ' he replied easily > 607 "The breakfast was fine," he replied easily. "A c > > 612 "Aw, hell ! ' * he muttered disgustedly, an > 612 "Aw, hell!" he muttered disgustedly, and we > > 614 "By Jove, that was a fine sleep I had,'* h > 614 "By Jove, that was a fine sleep I had," he > > 615 "Naw/* Casey's grunt was eloquent o > 615 "Naw." Casey's grunt was eloquent o > > 616 "Get the car fixed all right! '* Mack Nolan' > 616 "Get the car fixed all right?" Mack Nolan's > > 617 "Naw.'' Then Casey added grimly, "I' > 617 "Naw." Then Casey added grimly, "I'm > > 621 "Aw, let the darned thing alone till we eat, ' > 621 "Aw, let the darned thing alone till we eat," h > > 624 * ' Well, mebby I 'm kind of a crank about m > 624 "Well, mebby I'm kind of a crank about my ca > > 625 "At the same time, ' ' he went on with risin > 625 "At the same time," he went on with rising c > > 630 ""What sort of looking fellows were those > 630 "What sort of looking fellows were those, > > 633 "It might help us both considerably,*' h > 633 "It might help us both considerably," he > > 634 Casey puffed hard on his pipe. * * The world ' > 634 Casey puffed hard on his pipe. 
"The world's gi > > 647 "I '11 do a little more guessing, now : I gues > 647 "I'll do a little more guessing, now: I guess > > 648 "You go taheU ! ' ' growled Casey, swallow > 648 "You go tahell!" growled Casey, swallowing > > 651 Casey grunted. "Chump is right, mebby > 651 Casey grunted. "'Chump' is right, meb > > 661 "So it's White Mule you??re trailin\'' H > 661 "So it's White Mule you're trailin'." He > > 664 "This Smiling Lou; you??d know him again > 664 "This Smiling Lou; you'd know him again, > > 667 "Y\ih -- whatf'' In the firelight Casey' > 667 "Yuh -- WHAT?" In the firelight Casey's > > 670 "There ??s something else that feller told m > 670 "There's something else that feller told me > > 674 "Well, ' ' Casey said grimly, * * I dunno ho > 674 "Well," Casey said grimly, "I dunno how scar > > 675 Again Mack Nolan laughed. "Catching ' > 675 Again Mack Nolan laughed. "Catching's > > 679 "I think we *d better be moving from here be > 679 "I think we'd better be moving from here bef > > 689 "And now, ' * said Nolan briskly, when he ha > 689 "And now," said Nolan briskly, when he had h > > 692 "We??U have to rebottle all the whisky,'* sai > 692 "We'll have to rebottle all the whisky," said > > 700 "There ain't a damn' bottle here! '' he bel > 700 "There ain't a damn' bottle here!" he bello > > 706 "Water I ' * He snorted disgustedly. * * Case > 706 "Water!" He snorted disgustedly. "Casey Ryan > > 708 "Which I wisht it wasn 't ! " snarled Casey > 708 "Which I wisht it wasn't!" snarled Casey. " > > 710 "Aw, tahell with your White Mule I Tahel > 710 "Aw, tahell with your White Mule! Tahell > > 721 "The thing's deeper than it looked yester > 721 "The thing's deeper than it looked, yeste > > 723 "I??m absolutely certain, Casey, that if yo > 723 "I'm absolutely certain, Casey, that if you > > 724 Casey sat up. * * Well, they coulda played m > 724 Casey sat up. 
"Well, they coulda played me f > > 725 "All the more reason, ' ' said Nolan, als > 725 "All the more reason," said Nolan, also s > > 729 "Nothing for it, Casey, -- we '11 have to lo > 729 "Nothing for it, Casey, -- we'll have to lo > > 734 "You could go out and highjack some one, ' > 734 "You could go out and highjack some one." N > > 735 "Casey studied the matter. "Bill Master > 735 Casey studied the matter. "Bill Masters > > 738 Nolan ??s crisp tone of authority remained wit > 738 Nolan's crisp tone of authority remained with > > 740 Mace Nolan had just crawled into his bun > 740 Mack Nolan had just crawled into his bun > > 748 "Tis a good thing yuh left this other ca > 748 "'Tis a good thing yuh left this other c > > 753 Casey raised to one elbow. * * When yuh tol > 753 Casey raised to one elbow. "When yuh told C > > 754 ?? ' Good I I thought I hadn 't made a mistak > 754 "Good! I thought I hadn't made a mistake in m > > 761 Casey fell asleep inmiediately afterward, bu > 761 Casey fell asleep immediately afterward, but > > 763 "Well, the sheriff didn't arrive last night,' > 763 "Well, the sheriff didn't arrive last night," > > 764 "It was a good job! "Casey maintained > 764 "It was a GOOD job!" Casey maintained > > 770 Nolan laughed his easy little chuckle. * * Why > 770 Nolan laughed his easy little chuckle. "Why, n > > 780 "Help me unload this stuff, Ryan,'* he said > 780 "Help me unload this stuff, Ryan," he said, > > 782 "An' how many did you lick, Mr. Nolan? > 782 "An' how many did YOU lick, Mr. Nolan?" > > 787 "When it 's as easy done as that, Mr. Nolan > 787 "When it's as easy done as that, Mr. Nolan, > > 805 "I dunno -- nothin 's been picked up since > 805 "I dunno -- nothin's been picked up since I > > 812 "What luck, Ryan ? I beat you back b > 812 "What luck, Ryan? I beat you back by > > 813 "Nawl '* Casey spat disgustedly. "Neve > 813 "Naw!" Casey spat disgustedly. 
"Never > > 816 You can 't wonder if relations were somewha > 816 You can't wonder if relations were somewhat > > 818 Natubb had made Casey Ryan an optimist > 818 Nature had made Casey Ryan an optimist > > 823 "They's times,'' said Casey, hopefully low > 823 "They's times," said Casey, hopefully lowe > > 825 "* Yeah 1 ' ' Casey cocked a knowing eye at th > 825 "Yeah?" Casey cocked a knowing eye at the spea > > 828 "Arizona, I see.'' The man nodded towar > 828 "Arizona, I see." the man nodded toward > > 829 "TJh-huh. ' ' Casey glanced that way > 829 "Uh-huh." Casey glanced that way. "K > > 831 ??' Some. Do y > 831 "Some. Do you? > > 833 "Friend uh yours ? ' ' The fellow turned hi > 833 "Friend uh yours?" the fellow turned his he > > 838 "Yeah! '' The self-styled Jim Cassid > 838 "Yeah?" the self-styled Jim Cassidy > > 841 "You pass, ' ' he stated, with a relieved sigh > 841 "You pass," he stated, with a relieved sigh. " > > 842 "You know 'im, all right.'' Casey als > 842 "You know 'im, all right." Casey also > > 849 * ' Hullo ! Where 's your pardner ? * * he de > 849 "Hullo! Where's your pardner?" he demanded th > > 850 "I 'm in pardnerships with myself this trip, ' > 850 "I'm in pardnerships with myself this trip," Ca > > 851 "Where did you get that car > 851 "Where did you get that car? > > 853 "Got any booze in that car! '* Smiling Lo > 853 "Got any booze in that car?" Smiling Lou > > 855 * ' I wisht you wouldn *t look, * ' he said glumly > 855 "I wisht you wouldn't look," he said glumly. "I go > > 860 "The boards is turned over on all the rest/ > 860 "The boards is turned over on all the rest, > > 861 "What all have you got? '* Smiling Lou low > 861 "What all have you got?" Smiling Lou lower > > 862 "Well, get it into my car, and make i > 862 "Well, get it into my car, and make > > 866 "All right -- I to the goat/' he surrendere > 866 "All right -- I'm the goat," he surrendered > > 868 After that. 
Smiling Lou started his moto > 868 After that, Smiling Lou started his moto > > 870 "How much did he git off 'n youf '' he aske > 870 "How much did he get off'n YOU?" he asked i > > 871 "Clean as a last year's bone in a kioty den/ > 871 "Clean as a last year's bone in a kioty den, > > 872 "He wouldn't -- not mth you workin' o > 872 "He wouldn't -- not with you workin' > > 874 "Oh, Lou's cute, all right They don't an > 874 "Oh, Lou's cute, all right. They don't a > > 875 "Second trip, ' ' Casey informed him with a > 875 "Second trip," Casey informed him with an a > > 877 "That'll suit me fine,'' Casey declared. An > 877 "That'll suit me fine," Casey declared. And > > 888 "Where 'd you get this car? '' he demanded > 888 "Where'd you get this car?" he demanded, i > > 889 "Bought it, ' ' Casey told him gruf > 889 "Bought it," Casey told him gruffly > > 891 "Over at Goffs, just this side of Needles/ > 891 "Over at Goffs, just this side of Needles. > > 892 "Got a bUl of sale? ' > 892 "Got a bill of sale?" > > 893 "You got Casey Ryan 's word f er it, * * Case > 893 "You got Casey Ryan's word fer it," Casey ret > > 894 "Are you Casey Ryan? '* The speed cop' > 894 "Are you Casey Ryan?" the speed cop's > > 895 "Anybody says I ain *t, you send 'em to m > 895 "Anybody says I ain't, you send 'em to me > > 899 "Heyl Don't I git paid fer my gas?" th > 899 "Hey! Don't I git paid fer my gas?" th > > 900 "* Aw, go tahell I ' ' Casey grunted, and thre > 900 "Aw, go tahell!" Casey grunted, and threw a wa > > 902 "Thro win' money around like a hootch-run > 902 "Throwin' money around like a hootch-runn > > 903 Casey "got going.*' Twice on the way i > 903 Casey "got going." 
Twice on the way in
>
> 905 Casey was booked -- along with * * To
> 905 Casey was booked -- along with "Tom S
>
> 907 He waited for an hour or two, Ustening wit
> 907 He waited for an hour or two, listening wi
>
> 913 Jim Cassidy still dung desperately to hi
> 913 Jim Cassidy still clung desperately to h
>
> 914 His chief desire now was to get oat of ther
> 914 His chief desire now was to get out of ther
>
> 919 At that it was a fooPs errand. Casey wa
> 919 At that it was a fool's errand. Casey w
>
> 920 "It's no use asking questions. Jack,*' th
> 920 "It's no use asking questions, Jack," the
>
> 936 Five miles east of Amboy, when a red sraise
> 936 Five miles east of Amboy, when a red sunset
>
> 941 "Why, hello, Ryan! " Madi Nolan greeted
> 941 "Why, hello, Ryan!" Mack Nolan greeted,
>
> 942 "Naw. This ain't no trouble," he grunted
> 942 "Naw. This ain't no trouble," he granted
>
> 945 "I'm pretty good at guessing,** he smiled
> 945 "I'm pretty good at guessing," he smiled.
>
> 950 "Now, of course, I*m talking like an ol
> 950 "Now, of course, I'm talking like an ol
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d at lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From Bowerbird at aol.com Fri Nov 21 13:49:32 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 21 Nov 2008 16:49:32 EST
Subject: [gutvol-d] trail of the white mule -- 001
Message-ID:

keith said:
> Yours is a very crude big stick,

why, thank you.

i think...

> But do not forget Goliath was
> felled by a small and refined tool.
don't you worry, i got a slingshot too.

and i can put a rocket in my slingshot.

-bowerbird

p.s. for those who are keeping track,
we've got one more paragraphing error:
> He led it conquered back to the Ford, tied it
> http://z-m-l.com/go/wmule/wmulep028.jpg

From Bowerbird at aol.com Sat Nov 22 12:14:58 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Sat, 22 Nov 2008 15:14:58 EST
Subject: [gutvol-d] trail of the white mule -- 002
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

we now have the text up on my site, not just the images...

note that i used the o.c.r. from archive.org -- as that is
the only text at this point with the proper linebreaks --
and it's still badly-flawed, largely-raw, uncorrected o.c.r.

but at least we can cruise through the book easily now,
and the discussion will be served by better navigation...
> http://z-m-l.com/go/wmule/wmulep123.html

-bowerbird

p.s. latest happened-to-notice error in the reposting:
> Casey Ryan crawled out and looked up at her grinning sheepishly.
> Casey Ryan crawled out and looked up at her, grinning sheepishly.
> http://z-m-l.com/go/wmule/wmulep275.html

From schultzk at uni-trier.de Sun Nov 23 01:06:26 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Sun, 23 Nov 2008 10:06:26 +0100
Subject: [gutvol-d] trail of the white mule -- 001
In-Reply-To:
References:
Message-ID:

Hi BB,

On 21.11.2008 at 22:49, Bowerbird at aol.com wrote:

> keith said:
> > Yours is a very crude big stick,
>
> why, thank you.

You are welcome.

> i think...
>
> > But do not forget Goliath was
> > felled by a small and refined tool.
>
> don't you worry, i got a slingshot too.

Here is the "I think"!

> and i can put a rocket in my slingshot.

Do not forget the guidance system! Stealth technology would
be great, too ;-)).

Regards
Keith.

From Bowerbird at aol.com Sun Nov 23 11:08:46 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Sun, 23 Nov 2008 14:08:46 EST
Subject: [gutvol-d] trail of the white mule -- 003
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

***

the final aspect of the paragraphing check revealed two errors,
both seeming to revolve on where a word ("granite") appeared,
with "huge, granite" missing at one point, then "granite ledge"
being inserted incorrectly at another point a few lines down...

> the arc of his vision was a ledge,
> the arc of his vision was a huge, granite ledge
> http://z-m-l.com/go/wmulep040.html

> cracks where moisture longest remained; granite ledge
> cracks where moisture longest remained;
> http://z-m-l.com/go/wmulep040.html

since the improper insertion occurred at a page-break, this might be
the result of the not-totally-infrequent o.c.r. error where some word
or phrase that occurs mid-page is dropped to page-bottom (sometimes
because the o.c.r. thinks that it was "another column").

that doesn't appear to fit the situation all that well here, but i am
unable to see any other explanation, other than human error...

at any rate, we've hit al's "magic number" of 5 errors on this repost.
and we haven't even really started checking yet. not a good sign...
***

i have now loaded the project gutenberg text up onto my site,
as it now has the correct line-breaks, instead of the o.c.a. text.

the p.g. e-text is named "wmule.pgr" -- for "p.g. reposting" --
and the o.c.a. text is named "wmule.oca". remember at this point,
both the texts are considered undone, and might be flawed.

the p.g. text is obviously much cleaner, but a comparison
between the two texts will show how clean...

enjoy your weekend! :+)

-bowerbird

From Bowerbird at aol.com Sun Nov 23 15:31:36 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Sun, 23 Nov 2008 18:31:36 EST
Subject: [gutvol-d] trail of the white mule -- 003
Message-ID:

i made a mistake on those u.r.l. pointers...

> the arc of his vision was a ledge,
> the arc of his vision was a huge, granite ledge
> http://z-m-l.com/go/wmule/wmulep040.html

> cracks where moisture longest remained; granite ledge
> cracks where moisture longest remained;
> http://z-m-l.com/go/wmule/wmulep040.html

i'm such a bungler... ;+)

-bowerbird

p.s. by now, though, you should understand the structure of
the links on my site, and be able to know if one is wrong.
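
The link structure mentioned in that p.s. can be checked mechanically. What follows is only a minimal sketch: the pattern is an assumption inferred from the page links quoted in this thread (book directory, then the book name again, then "p" and a three-digit page number), and the name `is_page_link` is hypothetical, not anything published for the site.

```python
import re

# Assumed pattern, inferred only from links quoted in this thread:
#   http://z-m-l.com/go/<book>/<book>p<NNN>.html  (or .jpg for the scan)
PAGE_LINK = re.compile(r"^http://z-m-l\.com/go/([a-z]+)/\1p(\d{3})\.(html|jpg)$")


def is_page_link(url):
    """Return True if url matches the assumed page-link structure."""
    return PAGE_LINK.match(url) is not None


# The botched pointer from message 003 is missing the book directory.
print(is_page_link("http://z-m-l.com/go/wmulep040.html"))
# The corrected pointer repeats the book name as a directory.
print(is_page_link("http://z-m-l.com/go/wmule/wmulep040.html"))
```

Other links in the thread, such as the color-highlighted comparison page, do not follow this shape, so a check like this would only apply to per-page links.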
From Bowerbird at aol.com Mon Nov 24 00:03:15 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 24 Nov 2008 03:03:15 EST
Subject: [gutvol-d] trail of the white mule -- 004
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

***

welcome back to the work-week...

now that our paragraphs are in synch, we can do the compare.

we find 785 mismatched lines between the 2 different versions,
roughly 11% of the 7000+ lines in the file, which is quite typical.

you can see the book, with mismatches color-highlighted, here:
> http://z-m-l.com/go/wmule/wmule-785.html

i haven't bothered turning this into a "form", so that people can
click radio-buttons to signify which of the two choices is correct,
with a companion program noting when there was a consensus,
but i think you can see that it would be quite elementary to do...

by duplicating this procedure over two or more people, we could
avoid the errors that can result if one person did the merge alone,
especially whenever that person is a big bad bungler like myself...

i will, of course, do the merge myself. but perhaps jose menendez
will repeat the task after me, just to find all the errors that i missed.

***

also of interest, generally, is the programming of a routine that will
do the merge task for us. in many cases, it's a fairly straightforward
operation that doesn't really require human interpretation. honest.

for instance, one obvious rule would be that the routine will choose
the line which will pass spellcheck instead of the line that does not...
to illustrate this, we'll look at 4 of the first 5 mismatches we found:

001 --> sharply away from a hysterically clanging
001 --> sharply away from a hysterically danging
001 --> =================================^^------

002 --> inches to spare and was halfway down the block
002 --> inches to spare and was halfway down the blodc
002 --> ============================================^^

003 --> accepting a pale, expensive ticket which he
003 --> accepting a pale, expensive ticket whicjh he
003 --> =======================================^----

005 --> Casey declaimed hotly. "I never was asked
005 --> Casey dedaimed hotly. "I never was asked
005 --> ========^^-------------------------------

(to use the changeline best, view these with a monospace font.)

in the first, the choice is "clanging" versus "danging".
in the second, the choice is "block" versus "blodc".
in the third, the choice is "which" versus "whicjh".
and in the last, it's "declaimed" versus "dedaimed".

obviously, we don't need a human to make the decision on these,
since the computer can check the dictionary by itself without help.

see if you can come up with any more rules that our routine can use
to make the right choice on these mismatches without bothering us.

-bowerbird

p.s. wouldn't you know it, but "danging" _is_ in my dictionary. so,
i have now put it on my list of words to be considered for deletion,
along with "wax" (which is a stealth scanno for "was") and others...
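
The spellcheck rule and the changeline in message 004 can be sketched in a few lines. This is only an illustration under stated assumptions, not the routine used on the site: the names `pick_line` and `changeline` are hypothetical, the dictionary is passed in as a plain set of lowercase words, and the changeline format ("=" over the common prefix, "^" over the span that differs, "-" over the common suffix) is inferred from the examples quoted in the message.

```python
import re


def words(line):
    """Lowercased words in a line; internal apostrophes are kept."""
    return [w.lower() for w in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", line)]


def passes_spellcheck(line, dictionary):
    """True if every word of the line is in the dictionary (a set)."""
    return all(w in dictionary for w in words(line))


def pick_line(line_a, line_b, dictionary):
    """Return the one line that passes spellcheck, or None when the
    rule cannot decide (both pass or both fail) and a human is needed."""
    ok_a = passes_spellcheck(line_a, dictionary)
    ok_b = passes_spellcheck(line_b, dictionary)
    if ok_a and not ok_b:
        return line_a
    if ok_b and not ok_a:
        return line_b
    return None


def changeline(line_a, line_b):
    """Mark common prefix with '=', differing span with '^', and
    common suffix with '-' (format assumed from the quoted examples)."""
    shortest = min(len(line_a), len(line_b))
    prefix = 0
    while prefix < shortest and line_a[prefix] == line_b[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < shortest - prefix
           and line_a[-1 - suffix] == line_b[-1 - suffix]):
        suffix += 1
    changed = max(len(line_a), len(line_b)) - prefix - suffix
    return "=" * prefix + "^" * changed + "-" * suffix


a = "sharply away from a hysterically clanging"
b = "sharply away from a hysterically danging"
small_dictionary = {"sharply", "away", "from", "a", "hysterically", "clanging"}
print(pick_line(a, b, small_dictionary))
print(changeline(a, b))
```

As the p.s. above points out, the rule is only as good as the word list: once "danging" is in the dictionary, both lines pass and `pick_line` returns None instead of choosing. Line-end fragments of hyphenated words would also fail on both sides, so they fall through to a human as well.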
From Bowerbird at aol.com Mon Nov 24 14:30:38 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Mon, 24 Nov 2008 17:30:38 EST
Subject: [gutvol-d] trail of the white mule -- 005
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

***

before we move on to the specific comparison data results,
a short philosophical ramble.

again in this reposting, as in other repostings, the _italics_
continue to be represented by an all-uppercase rendering.

i know the historical reason.

but it seems to me that part of the promise implied by
"reposting" is "bringing books up to today's standards".

look around. it's not 1978. or 1988. it's not even 1998.
it's 2008, and the web has supported italics for 10+ years.

there are about 60 cases of italics in this book, and they
_should_ be represented by _underlines_ (in the text file),
not uppercase. and they _should_ be italics in the .html...
(isn't that part of what it means to have an .html version?)

now personally, i do count these 60 instances as "errors".
you can if you want, or you don't have to. but the point is,
they need to be fixed, whether you call 'em errors or not...

enough said.

-bowerbird

From schultzk at uni-trier.de Tue Nov 25 02:54:56 2008
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Tue, 25 Nov 2008 11:54:56 +0100
Subject: [gutvol-d] trail of the white mule -- 005
In-Reply-To:
References:
Message-ID: <200B2461-E1B7-4CAA-975F-7282FBF2B091@uni-trier.de>

Hi BB,

The problem of STANDARDS has been thoroughly discussed.
I believe we should adhere to the standard or change it. (No, not the web standard.
) Changing the standard, as you well know, probably will not happen.

regards
Keith.

On 24.11.2008 at 23:30, Bowerbird at aol.com wrote:

> we're checking the reposted "trail of the white mule", #2063.
>
> ***
>
> before we move on to the specific comparison data results,
> a short philosophical ramble.
>
> again in this reposting, as in other repostings, the _italics_
> continue to be represented by an all-uppercase rendering.
>
> i know the historical reason.
>
> but it seems to me that part of the promise implied by
> "reposting" is "bringing books up to today's standards".
>
> look around. it's not 1978. or 1988. it's not even 1998.
> it's 2008, and the web has supported italics for 10+ years.
>
> there are about 60 cases of italics in this book, and they
> _should_ be represented by _underlines_ (in the text file),
> not uppercase. and they _should_ be italics in the .html...
> (isn't that part of what it means to have an .html version?)
>
> now personally, i do count these 60 instances as "errors".
> you can if you want, or you don't have to. but the point is,
> they need to be fixed, whether you call 'em errors or not...
>
> enough said.
>
> -bowerbird

From Bowerbird at aol.com Tue Nov 25 13:09:51 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 25 Nov 2008 16:09:51 EST
Subject: [gutvol-d] trail of the white mule -- 006
Message-ID:

we're checking the reposted "trail of the white mule", #2063.
***

but first, one more meta-comment...

at one point, al responded that he didn't have enough time
to do the kind of analysis that i was doing to find errors that
he hadn't found. and i replied that i understood. for me, it's
all cost/benefit.

what i didn't remark on then -- and will now -- is the curious
fact that al didn't even ask me how much time i was spending.

he also didn't tell us how much time _he_ spends on a repost.
and he didn't tell us anything about his reposting workflow...

now, i can tell you that i am indeed spending some time on
this research that i'm doing, not an immense amount of time,
but a substantial amount. however, the vast bulk of that time
is in the programming of tools that help me do the job faster.

tool-programming is a one-time cost, with benefits that repeat
the more you use the tool, which is why it's a smart path to take.
especially if you intend to use the tool on millions of o.c.r. books.

so, even with my tool-programming time figured into the mix,
my workflow just might be more speedy than al's workflow now.
but certainly when my tools are refined, my way will be faster...

-bowerbird

From Bowerbird at aol.com Tue Nov 25 15:01:20 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 25 Nov 2008 18:01:20 EST
Subject: [gutvol-d] trail of the white mule -- 007
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

***

ok, on to the data...

this is just the first pass, so it might be kinda raw. but i show some
80+ differences between my _merged_ text and the reposted book.

some of those could be errors on my part -- i make them!
-- but i'm reasonably confident that the vast majority are errors in the repost.
i'd guess it'd be 60 easy, maybe 70, or even 80.

and there may be more errors i missed; you'll have to ask jose.

i won't have time to _verify_ these against their scans _again_
before the thanksgiving holiday, but if anyone -- like jose --
wants to do it for me, that would be wonderful. let me know,
and i'll post the u.r.l. for each mismatch, for your convenience.

as a summary before then, i'll just say that 60-70 errors means
the whitewashers should evaluate their need for improved tools.

have a nice day...

-bowerbird

p.s. top line is the reposted text, bottom line is my text:

> Casey bought it just to show who was boss,-
> Casey bought it just to show who was boss --

> Casey would go when he, got good and ready.
> Casey would go when he got good and ready.

> ny good whatever. Sometimes they were re-
> any good whatever. Sometimes they were re-

> be cranked, Casey was busy gathering brush,
> be cranked, Casey was busy gathering brush

> for his supper fire when Fate came walking up'
> for his supper fire when Fate came walking up

> tell--"
> tell."

> rock, from camp, when the thin, unmistakable
> rock from camp, when the thin, unmistakable

> thick groves of pinon cedar and juniper trees
> thick groves of pinon, cedar and juniper trees

> the arc of his vision was a ledge,
> the arc of his vision was a huge, granite ledge

> cabin squatted secretively. One small window,
> cabin squatted secretively. One small window

> dread took hold of him, and grew while he
> dread took hold of him and grew while he

> Against the limitations proscribed by his ma-
> Against the limitations prescribed by his ma-

> concealment of its branches, he surveyed his
> concealment of its branches he surveyed his

> "I'd fix 'im, here an', now," threatened Paw,
> "I'd fix 'im here an' now," threatened Paw,

> your board -- c'm on an' I'll show yuh how."
> your board -- c'mon an' I'll show yuh how."
> wait awhile and see, why these miners found
> wait awhile and see why these miners found

> "More likely 'White Mule.'" Casey cocked
> "More likely 'White Mule'." Casey cocked

> they was up here huntin' burros an I caught yuh
> they was up here huntin' burros an' caught yuh

> and Joe -- outlaws all, he would have sworn-
> and Joe -- outlaws all, he would have sworn --

> the dugout, and know that he was permitted to
> the dugout, and knew that he was permitted to

> shots and could tell, almost to an inch what
> shots and could tell almost to an inch what

> quick, as fiction would have them, but if his aim
> quick as fiction would have them, but if his aim

> anything like 'that, you can trust Casey Ryan
> anything like that, you can trust Casey Ryan

> year's to burn. Tell Mart the hounds of hell
> years to burn. Tell Mart the hounds of hell

> grin. "Me, I never went lookin' fer nothin,
> grin. "Me, I never went lookin' fer nothin'

> kick'to it?"
> kick to it?"

> table his good right hand supporting his left
> table, his good right hand supporting his left

> elbow outside the sling. He grinned at Joe
> elbow outside the sling. He grinned at Joe;

> back that, ain't quite so kicky. Been agin' it
> back that ain't quite so kicky. Been agin' it

> Paw accepted this remark, as high praise,
> Paw accepted this remark as high praise,

> know your pardner, BARNEY OAKES?
> know your pardner, BARNEY OAKES?"

> "Ah-h -- I know yuh think I don't? I
> "Ah-h -- I know yuh -- think I don't? I

> cor'ner-but he won't set on Casey Ryan's
> cor'ner -- but he won't set on Casey Ryan's

> asked moving toward him. "Is she here?"
> asked, moving toward him. "Is she here?"

> said gruffly. "It IS kinda -- pitiful. Thinks
> said gruffly. "It's?? kinda -- pitiful. Thinks

> hard times, for a miner, who ships no ore.
> hard times, for a miner who ships no ore.

> I told him held have to go when his month is
> I told him he'd have to go when his month is

> Ryan he won't go!
Who'd, they think's runnin'
> Ryan he won't go! Who'd they think's runnin'

> shoulder blades, and awoke to the fact that he
> shoulder blades and awoke to the fact that he

> street cars running back to town all the time I
> street cars running back to town all the time,

> rat poison. I've got no use for the clowns-
> rat poison. I've got no use for the clowns --

> jack. I'm plumb fed upon them pardnerships.
> jack. I'm plumb fed up on them pardnerships.

> "Fer, as I'm concerned, Casey's never backed
> "Fer as I'm concerned, Casey's never backed

> an I let fly an' it landed on a lady; an' the
> an' let fly an' it landed on a lady; an' the

> missus went an' bought her a new hat an took
> missus went an' bought her a new hat an' took

> the hills prospectin, or somethin', that roll uh
> the hills prospectin' or somethin', that roll uh

> He paused; and when he, spoke again his tone
> He paused; and when he spoke again his tone

> spent four hours on a hill once, out-settin, a
> spent four hours on a hill once, out-settin' a

> merciful as, it can afford to be, and I've got a
> merciful as it can afford to be, and I've got a

> bear, an' let yuh go on an make a profit so
> bear, an' let yuh go on an' make a profit so

> The muscles, along Casey's jaw had hardened
> The muscles along Casey's jaw had hardened

> it in the Ford. Until he did know, he was harm-
> fit in the Ford. Until he did know, he was harm-

> look his way. "Thought I left you takin, a
> look his way. "Thought I left you takin' a

> How Is that for guesswork?"
> How's?? that for guesswork!"

> mind reading an' forecastin' your horrorscope
> mind readin' an' forecastin' your horrorscope

> Casey grunted. "'Chump' is right, mebby.
> Casey grunted. "Chump is right, mebby.

> 'im. I told im I would."
> 'im. I told 'im I would."

> roundabout trail with a few tire tracks to
> a roundabout trail with a few tire tracks to

> glance and nodded approval as ?? ?? ??
> glance and nodded approval as he drove up

> and, as a secondary consideration other crooks
> and, as a secondary consideration, other crooks

> cellar. Nolan was pleased; too, when Casey
> cellar. Nolan was pleased, too, when Casey

> Mack Nolan's eyes narrowed. "I think
> Mack Nolan's eyes narrowed. "I think,

> range, and the little black buttes standing afar,
> range, and the little black buttes standing afar

> "The thing's deeper than it looked, yester-
> "The thing's deeper than it looked yester-

> 'em yes. An' I'll say there was a bunch of 'em
> 'em, yes. An' I'll say there was a bunch of 'em,

> "You could go out and highjack someone."
> "You could go out and highjack someone,"

> "If you can bring back a load of moonshine
> "If you can bring back a load of moonshine,

> he's on the trail an' travelin, has yet t' be
> he's on the trail an' travelin' has yet t' be

> be secret -- Mr. Nolan, you 'was talkin' t'
> be secret -- Mr. Nolan, you was talkin' t'

> put a time limit me, Mr. Nolan, an' nobody
> put a time limit on me, Mr. Nolan, an' nobody

> darn' good Ford yuh got! I was follered, and'
> darn' good Ford yuh got! I was follered, and

> I was follered hard. But I'm here an' they'
> I was follered hard. But I'm here an' they

> knowed who it was at all, they'd know I bel-
> knowed who it was at all, they'd know I be-

> kinda tough to break out with stealin I what yuh
> kinda tough to break out with stealin' what yuh

> miracles. While he did not, literally change
> miracles. While he did not literally change

> of lying deliberately to the Little Woman,-
> of lying deliberately to the Little Woman, --

> Nolan noticed his silence, he gave no sign.
> Nolan noticed his silence he gave no sign.

> On the day when his time limit expired
> On the day when his time limit expired,

> strained, between them for the rest of that day.
> strained between them for the rest of that day.

> smoothly. By the time Casey spread his bed-
> smoothly.
By the time Casey spread his bed --

> Jim Cassidy came furtively over and settle
> Jim Cassidy came furtively over and settled

> Mack Nolan and lick the livin' tar wit of him
> Mack Nolan and lick the livin' tar out of him

> her grinning sheepishly.
> her, grinning sheepishly.

> "Naw. This ain't no trouble," he granted.
> "Naw. This ain't no trouble," he grunted.

From Bowerbird at aol.com Wed Nov 26 02:32:12 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Wed, 26 Nov 2008 05:32:12 EST
Subject: [gutvol-d] tofu turkey time
Message-ID:

as many of you probably know, david rothman has been
dangling the idea of the imminent availability of _cheap_
e-book-machines -- as in $50 cheap -- for years now...

it's always been a ridiculous claim, and i called him on it
constantly, but he just kept repeating it and repeating it.

it's ridiculous because an e-book-machine is a computer
-- requiring the most expensive parts of any computer,
namely a chip and a screen -- thus a $50 computer is
something that's not gonna happen in the near future...

indeed, with the value of a dollar falling year-over-year,
there might _never_ be a $50 computer. sorry, charlie...

finally, 3 years ago now, right around thanksgiving time,
i posted a comment on rothman's teleblawg that if there
was a $50 computer being sold within 5 years from then,
i'd _buy_ him one. or else a turkey, as it was thanksgiving.

he responded to "check back in a year and see who's right".
> http://www.teleread.org/blog/?p=3911

so, every year about this time, i do a routine check back in. :+)

2006 came and went, and i was right. 2007, and i was right.
and, surprise, surprise, what do you know?, but here in 2008,
there's still no $50 computer on the market... none in sight...

so, after checking back for 3 straight years, i'm _still_ right...

david's crystal ball is severely cracked.

***

we do have some interesting data-points in 2008, however...

first, there _are_ e-book reader-machines, as you all know,
including the kindle and the sony machine. about $300 each.
that's a long way from 50 bucks. quite a long way, to be exact.
and since capitalists now know they _can_ demand this price,
anyone who knows how they operate will know that they _will_.
they always charge the highest price the marketplace will bear.

second, some people -- like me -- bought an o.l.p.c. machine.
originally pitched as "the $100 computer", but the cost is $200...
and you have to buy two of 'em, one of which is donated to a kid,
so your total bill is $400. but it's for a good cause, so go buy it...
> http://www.amazon.com/olpc

third is the iphone, now beginning to look like a real computer.
it's $200, providing you also buy a $50/month cell-phone plan.
i'd call this the leading contender (although amazon's kindle has
the advantage of its content). but a mile away from $50 one-time.

my estimate that it would take 5 years to hit david's price-point of
$50 is looking to be more and more accurate as the years roll by.

meanwhile, david still spins his hype about a low-cost machine.
he's bumped the price up to $100 now, so his crystal ball won't
appear to be quite as cracked, but he's still off by a long shot...

some people never learn...

doing more damage than good, that's the david rothman way.

but hey, let's give thanks that he's out of the hospital and
sitting in his la-z-boy recliner reading e-books...

-bowerbird

From Bowerbird at aol.com Wed Nov 26 23:36:51 2008
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 27 Nov 2008 02:36:51 EST
Subject: [gutvol-d] trail of the white mule -- 008
Message-ID:

we're checking the reposted "trail of the white mule", #2063.

i wanna finish this up before my holiday, but y'all can certainly
wait until after _your_ holiday to read it, that's fine with me...

***

i've verified all the differences i found between the versions.

and it looks like we've got around 100 errors in this "repost".

the list is appended. it also includes 17 cases of words that
start with "any" or "some" or "every" -- like "anyone" -- which,
in this book, were printed like "any one", i.e., as two words...

once we add in the 60+ "errors" due to improper italicization,
plus the 4 paragraphing errors listed below, we end up with
a very large number of errors -- closing in on a total of 186.

in a book with 278 pages, that's a pretty high ratio of errors.

there's no sense dwelling on them, since the whitewashers
don't care, but it seems to me they need to do a better job...

oh well...

my clean file is here:
> http://z-m-l.com/go/wmule/wmule.zml

and the book is updated:
> http://z-m-l.com/go/wmule/wmulep123.html

enjoy your thanksgiving dinner, and the nice long weekend...
-bowerbird

here are the 4 paragraphing errors:

> He led it conquered back to the Ford, tied it
> http://z-m-l.com/go/wmule/wmulep028.html

> The old woman dropped her hands to her
> http://z-m-l.com/go/wmule/wmulep083.html

> Through his slits of swollen lids Casey glared
> http://z-m-l.com/go/wmule/wmulep110.html

> Miles away to the south, pale violet, dream-
> http://z-m-l.com/go/wmule/wmulep214.html

here are the 100+ other errors:

001 --> Casey Ryan to take orders from any one, espe~
001 --> Casey Ryan to take orders from anyone, espe~
001 --> ==================================^----------
001 --> http://z-m-l.com/go/wmule/wmulep007.html

002 --> Casey bought it just to show who was boss,
002 --> Casey bought it just to show who was boss --
002 --> =========================================^^^
002 --> http://z-m-l.com/go/wmule/wmulep013.html

003 --> came. . . .
003 --> came...
003 --> =====^=^^^-
003 --> http://z-m-l.com/go/wmule/wmulep016.html

004 --> Casey would go when he, got good and ready.
004 --> Casey would go when he got good and ready.
004 --> ======================^--------------------
004 --> http://z-m-l.com/go/wmule/wmulep023.html

005 --> ny good whatever. Sometimes they were re~
005 --> any good whatever. Sometimes they were re~
005 --> ^-----------------------------------------
005 --> http://z-m-l.com/go/wmule/wmulep026.html

006 --> be cranked, Casey was busy gathering brush,
006 --> be cranked, Casey was busy gathering brush
006 --> ==========================================^
006 --> http://z-m-l.com/go/wmule/wmulep026.html

007 --> for his supper fire when Fate came walking up'
007 --> for his supper fire when Fate came walking up
007 --> =============================================^
007 --> http://z-m-l.com/go/wmule/wmulep026.html

008 --> tell--"
008 --> tell."
008 --> ====^^-
008 --> http://z-m-l.com/go/wmule/wmulep034.html

009 --> rock, from camp, when the thin, unmistakable
009 --> rock from camp, when the thin, unmistakable
009 --> ====^---------------------------------------
009 --> http://z-m-l.com/go/wmule/wmulep037.html

010 --> thick groves of pinon cedar and juniper trees
010 --> thick groves of pinon, cedar and juniper trees
010 --> =====================^------------------------
010 --> http://z-m-l.com/go/wmule/wmulep039.html

011 --> teau not half so barren as the lower country.
011 --> teau, not half so barren as the lower country.
011 --> ====^-----------------------------------------
011 --> http://z-m-l.com/go/wmule/wmulep040.html

012 --> the arc of his vision was a ledge,
012 --> the arc of his vision was a huge, granite ledge
012 --> ============================^^^^^^^^^^^^^^^^^^^
012 --> http://z-m-l.com/go/wmule/wmulep040.html

013 --> cabin squatted secretively. One small window,
013 --> cabin squatted secretively. One small window
013 --> ============================================^
013 --> http://z-m-l.com/go/wmule/wmulep042.html

014 --> dread took hold of him, and grew while he
014 --> dread took hold of him and grew while he
014 --> ======================^------------------
014 --> http://z-m-l.com/go/wmule/wmulep043.html

015 --> Against the limitations proscribed by his ma
015 --> Against the limitations prescribed by his ma~
015 --> ==========================^=================^
015 --> http://z-m-l.com/go/wmule/wmulep049.html

016 --> concealment of its branches, he surveyed his
016 --> concealment of its branches he surveyed his
016 --> ===========================^----------------
016 --> http://z-m-l.com/go/wmule/wmulep051.html

017 --> ple. His hands relaxed and fall away from the
017 --> ple.
His hands relaxed and fell away from the
017 --> ============================^----------------
017 --> http://z-m-l.com/go/wmule/wmulep052.html

018 --> "I'd fix 'im, here an', now," threatened Paw,
018 --> "I'd fix 'im here an' now," threatened Paw,
018 --> ============^^^^^^^^^^^----------------------
018 --> http://z-m-l.com/go/wmule/wmulep054.html

019 --> your board -- c'm on an' I'll show yuh how."
019 --> your board -- c'mon an' I'll show yuh how."
019 --> =================^--------------------------
019 --> http://z-m-l.com/go/wmule/wmulep056.html

020 --> wait awhile and see, why these miners found
020 --> wait awhile and see why these miners found
020 --> ===================^-----------------------
020 --> http://z-m-l.com/go/wmule/wmulep056.html

021 --> "More likely 'White Mule.'" Casey cocked
021 --> "More likely 'White Mule'." Casey cocked
021 --> ========================^^--------------
021 --> http://z-m-l.com/go/wmule/wmulep059.html

022 --> picious.
022 --> picious."
022 --> ========^
022 --> http://z-m-l.com/go/wmule/wmulep060.html

023 --> "Guess I got yourn," Hank leered "when
023 --> "Guess I got yourn," Hank leered, "when I
023 --> ================================^^^^^^^^^
023 --> http://z-m-l.com/go/wmule/wmulep061.html

024 --> "If any one's 'been usin' a high-power it
024 --> "If anyone's been usin' a high-power it
024 --> =======^^^^^^^^--------------------------
024 --> http://z-m-l.com/go/wmule/wmulep061.html

025 --> they was up here huntin' burros an I caught yuh
025 --> they was up here huntin' burros an' caught yuh
025 --> ==================================^^-----------
025 --> http://z-m-l.com/go/wmule/wmulep061.html

026 --> and Joe -- outlaws all, he would have sworn
026 --> and Joe -- outlaws all, he would have sworn --
026 --> ===========================================^^^
026 --> http://z-m-l.com/go/wmule/wmulep062.html

027 --> stance.
027 --> stance."
027 --> =======^
027 --> http://z-m-l.com/go/wmule/wmulep062.html

028 --> the dugout, and know that he was permitted to
028 --> the dugout, and knew that he was permitted to
028 --> ==================^--------------------------
028 --> http://z-m-l.com/go/wmule/wmulep068.html

029 --> shots and could tell, almost to an inch what
029 --> shots and could tell almost to an inch what
029 --> ====================^-----------------------
029 --> http://z-m-l.com/go/wmule/wmulep069.html

030 --> two, capped five inches of fuse for each piece
030 --> two, capped five inches of fuse for each piece --
030 --> ==============================================^^^
030 --> http://z-m-l.com/go/wmule/wmulep075.html

031 --> of fuse protruded from the end of the half
031 --> of fuse protruded from the end of the half-
031 --> ==========================================^
031 --> http://z-m-l.com/go/wmule/wmulep075.html

032 --> quick, as fiction would have them, but if his aim
032 --> quick as fiction would have them, but if his aim
032 --> =====^-------------------------------------------
032 --> http://z-m-l.com/go/wmule/wmulep076.html

033 --> he made a thick dough of leftover pancake bat
033 --> he made a thick dough of left-over pancake bat~
033 --> =============================^^^^^^^^^^^^^^^^^^
033 --> http://z-m-l.com/go/wmule/wmulep078.html

034 --> anything like 'that, you can trust Casey Ryan
034 --> anything like that, you can trust Casey Ryan
034 --> ==============^------------------------------
034 --> http://z-m-l.com/go/wmule/wmulep083.html

035 --> year's to burn. Tell Mart the hounds of hell
035 --> years to burn. Tell Mart the hounds of hell
035 --> ====^---------------------------------------
035 --> http://z-m-l.com/go/wmule/wmulep083.html

036 --> grin. "Me, I never went lookin' fer nothin,
036 --> grin. "Me, I never went lookin' fer nothin'
036 --> ==========================================^
036 --> http://z-m-l.com/go/wmule/wmulep086.html

037 --> kick'to it?"
037 --> kick to it?"
037 --> ====^-------
037 --> http://z-m-l.com/go/wmule/wmulep088.html

038 --> table his good right hand supporting his left
038 --> table, his good right hand supporting his left
038 --> =====^----------------------------------------
038 --> http://z-m-l.com/go/wmule/wmulep089.html

039 --> elbow outside the sling. He grinned at Joe
039 --> elbow outside the sling. He grinned at Joe;
039 --> ==========================================^
039 --> http://z-m-l.com/go/wmule/wmulep089.html

040 --> cause every one was laughing and bending dou
040 --> cause everyone was laughing and bending dou~
040 --> ===========^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
040 --> http://z-m-l.com/go/wmule/wmulep094.html

041 --> Every one, even Paw, who was normally a
041 --> Everyone, even Paw, who was normally a
041 --> =====^---------------------------------
041 --> http://z-m-l.com/go/wmule/wmulep094.html

042 --> back that, ain't quite so kicky. Been agin' it
042 --> back that ain't quite so kicky. Been agin' it
042 --> =========^------------------------------------
042 --> http://z-m-l.com/go/wmule/wmulep101.html

043 --> Paw accepted this remark, as high praise,
043 --> Paw accepted this remark as high praise,
043 --> ========================^----------------
043 --> http://z-m-l.com/go/wmule/wmulep102.html

044 --> ture flying a storm flag, had any one with a
044 --> ture flying a storm flag, had anyone with a
044 --> =================================^----------
044 --> http://z-m-l.com/go/wmule/wmulep104.html

045 --> testing 'squawk. "I brung 'em all the way
045 --> testing squawk. "I brung 'em all the way
045 --> ========^--------------------------------
045 --> http://z-m-l.com/go/wmule/wmulep110.html

046 --> know your pardner, BARNEY OAKES?
046 --> know your pardner, BARNEY OAKES?"
046 --> ================================^
046 --> http://z-m-l.com/go/wmule/wmulep110.html

047 --> "Ah-h -- I know yuh think I don't? I
047 --> "Ah-h -- I know yuh -- think I don't?
I 047 --> ====================^^----------------- 047 --> http://z-m-l.com/go/wmule/wmulep110.html 048 --> cor'ner-but he won't set on Casey Ryan's 048 --> cor'ner -- but he won't set on Casey Ryan's 048 --> =======^^^^-------------------------------- 048 --> http://z-m-l.com/go/wmule/wmulep110.html 049 --> "Brung a cor'ner, did yuh, lookin' for some 049 --> "Brung a cor'ner, did yuh, lookin' for some~ 049 --> ===========================================^ 049 --> http://z-m-l.com/go/wmule/wmulep113.html 050 --> asked moving toward him. "Is she here?" 050 --> asked, moving toward him. "Is she here?" 050 --> =====^---------------------------------- 050 --> http://z-m-l.com/go/wmule/wmulep116.html 051 --> said gruffly. "It IS kinda -- pitiful. Thinks 051 --> said gruffly. "It's kinda -- pitiful. Thinks 051 --> =================^^^------------------------- 051 --> http://z-m-l.com/go/wmule/wmulep117.html 052 --> have speech with any one there, they would 052 --> have speech with anyone there, they would 052 --> ====================^--------------------- 052 --> http://z-m-l.com/go/wmule/wmulep118.html 053 --> hard times, for a miner, who ships no ore. 053 --> hard times, for a miner who ships no ore. 053 --> =======================^------------------ 053 --> http://z-m-l.com/go/wmule/wmulep120.html 054 --> I told him held have to go when his month is 054 --> I told him he'd have to go when his month is 054 --> =============^------------------------------ 054 --> http://z-m-l.com/go/wmule/wmulep124.html 055 --> Ryan he won't go! Who'd, they think's runnin' 055 --> Ryan he won't go! 
Who'd they think's runnin' 055 --> =======================^--------------------- 055 --> http://z-m-l.com/go/wmule/wmulep124.html 056 --> shoulder blades, and awoke to the fact that he 056 --> shoulder blades and awoke to the fact that he 056 --> ===============^------------------------------ 056 --> http://z-m-l.com/go/wmule/wmulep135.html 057 --> street cars running back to town all the time I 057 --> street cars running back to town all the time, 057 --> =============================================^^ 057 --> http://z-m-l.com/go/wmule/wmulep136.html 058 --> rat poison. I've got no use for the clowns 058 --> rat poison. I've got no use for the clowns -- 058 --> ==========================================^^^ 058 --> http://z-m-l.com/go/wmule/wmulep136.html 059 --> jack. I'm plumb fed upon them pardnerships. 059 --> jack. I'm plumb fed up on them pardnerships. 059 --> ======================^--------------------- 059 --> http://z-m-l.com/go/wmule/wmulep145.html 060 --> "Fer, as I'm concerned, Casey's never backed 060 --> "Fer as I'm concerned, Casey's never backed 060 --> ====^--------------------------------------- 060 --> http://z-m-l.com/go/wmule/wmulep146.html 061 --> an I let fly an' it landed on a lady; an' the 061 --> an' let fly an' it landed on a lady; an' the 061 --> ==^^----------------------------------------- 061 --> http://z-m-l.com/go/wmule/wmulep147.html 062 --> missus went an' bought her a new hat an took 062 --> missus went an' bought her a new hat an' took 062 --> =======================================^----- 062 --> http://z-m-l.com/go/wmule/wmulep147.html 063 --> city life for yuh!" 063 --> city life for yuh! 
063 --> ==================^
063 --> http://z-m-l.com/go/wmule/wmulep148.html

064 --> the hills prospectin, or somethin', that roll uh
064 --> the hills prospectin' or somethin', that roll uh
064 --> ====================^---------------------------
064 --> http://z-m-l.com/go/wmule/wmulep148.html

065 --> down an' QUIT, by hock, and can be seen here~
065 --> down an' QUIT, by heck, and can be seen here~
065 --> ===================^-------------------------
065 --> http://z-m-l.com/go/wmule/wmulep149.html

066 --> He paused; and when he, spoke again his tone
066 --> He paused; and when he spoke again his tone
066 --> ======================^---------------------
066 --> http://z-m-l.com/go/wmule/wmulep149.html

067 --> spent four hours on a hill once, out-settin, a
067 --> spent four hours on a hill once, out-settin' a
067 --> ===========================================^--
067 --> http://z-m-l.com/go/wmule/wmulep154.html

068 --> merciful as, it can afford to be, and I've got a
068 --> merciful as it can afford to be, and I've got a
068 --> ===========^------------------------------------
068 --> http://z-m-l.com/go/wmule/wmulep156.html

069 --> Why didn't yuh pick some one else for the
069 --> Why didn't yuh pick someone else for the
069 --> ========================^----------------
069 --> http://z-m-l.com/go/wmule/wmulep160.html

070 --> bear, an' let yuh go on an make a profit so
070 --> bear, an' let yuh go on an' make a profit so
070 --> ==========================^-----------------
070 --> http://z-m-l.com/go/wmule/wmulep163.html

071 --> The muscles, along Casey's jaw had hardened
071 --> The muscles along Casey's jaw had hardened
071 --> ===========^-------------------------------
071 --> http://z-m-l.com/go/wmule/wmulep164.html

072 --> 'Here Is Casey Ryan -- a clown that's safe any
072 --> 'Here's Casey Ryan -- a clown that's safe any~
072 --> =====^^^^^^^^^^^^^^^=^^^^^^^^^^^^^^^^^^^^^^^^^
072 --> http://z-m-l.com/go/wmule/wmulep166.html

073 --> got a rep a mile long as a fightin', square~
073 --> got a rep a mile long as a fightin', square-
073 --> ===========================================^
073 --> http://z-m-l.com/go/wmule/wmulep166.html

074 --> look his way. "Thought I left you takin, a
074 --> look his way. "Thought I left you takin' a
074 --> =======================================^--
074 --> http://z-m-l.com/go/wmule/wmulep188.html

075 --> comfortably in the car, his back against the bed
075 --> comfortably in the car, his back against the bed-
075 --> ================================================^
075 --> http://z-m-l.com/go/wmule/wmulep191.html

076 --> plain to me before you knew there was any one
076 --> plain to me before you knew there was anyone
076 --> =========================================^---
076 --> http://z-m-l.com/go/wmule/wmulep197.html

077 --> you've doped it out that you'll pack the bed
077 --> you've doped it out that you'll pack the bed-
077 --> ============================================^
077 --> http://z-m-l.com/go/wmule/wmulep199.html

078 --> How Is that for guesswork?"
078 --> How's that for guesswork!"
078 --> ===^^^^^^^^^^^^^^^^=^^^^^^-
078 --> http://z-m-l.com/go/wmule/wmulep199.html

079 --> mind reading an' forecastin' your horrorscope
079 --> mind readin' an' forecastin' your horrorscope
079 --> ===========^---------------------------------
079 --> http://z-m-l.com/go/wmule/wmulep199.html

080 --> Casey grunted. "'Chump' is right, mebby.
080 --> Casey grunted. "Chump is right, mebby.
080 --> ================^^^^^^^-----------------
080 --> http://z-m-l.com/go/wmule/wmulep200.html

081 --> 'im. I told im I would."
081 --> 'im. I told 'im I would."
081 --> ============^------------
081 --> http://z-m-l.com/go/wmule/wmulep205.html

082 --> roundabout trail with a few tire tracks to
082 --> a roundabout trail with a few tire tracks to
082 --> ^^------------------------------------------
082 --> http://z-m-l.com/go/wmule/wmulep206.html

083 --> way, unless it were definitely known that some
083 --> way, unless it were definitely known that some~
083 --> ==============================================^
083 --> http://z-m-l.com/go/wmule/wmulep206.html

084 --> glance and nodded approval as ((words) (missing) (here))
084 --> glance and nodded approval as he drove up
084 --> ==============================^^^^=^^^=^^^^^^^^^^^^^^^^^
084 --> http://z-m-l.com/go/wmule/wmulep207.html

085 --> Black Butte bunch, f'r instance. But if any
085 --> Black Butte bunch, f'r instance. But if any~
085 --> ===========================================^
085 --> http://z-m-l.com/go/wmule/wmulep207.html

086 --> light. There's not one chance in fifty that any
086 --> light. There's not one chance in fifty that any~
086 --> ===============================================^
086 --> http://z-m-l.com/go/wmule/wmulep209.html

087 --> and, as a secondary consideration other crooks
087 --> and, as a secondary consideration, other crooks
087 --> =================================^-------------
087 --> http://z-m-l.com/go/wmule/wmulep210.html

088 --> cellar. Nolan was pleased; too, when Casey
088 --> cellar. Nolan was pleased, too, when Casey
088 --> =========================^----------------
088 --> http://z-m-l.com/go/wmule/wmulep211.html

089 --> Mack Nolan's eyes narrowed. "I think
089 --> Mack Nolan's eyes narrowed. "I think,
089 --> ====================================^
089 --> http://z-m-l.com/go/wmule/wmulep212.html

090 --> glamour of unreality. The mountains beyond,
090 --> glamor of unreality. The mountains beyond,
090 --> =====^-------------------------------------
090 --> http://z-m-l.com/go/wmule/wmulep215.html

091 --> range, and the little black buttes standing afar,
091 --> range, and the little black buttes standing afar
091 --> ================================================^
091 --> http://z-m-l.com/go/wmule/wmulep215.html

092 --> "The thing's deeper than it looked, yester
092 --> "The thing's deeper than it looked yester~
092 --> ==================================^^^^^^^^
092 --> http://z-m-l.com/go/wmule/wmulep217.html

093 --> 'em yes. An' I'll say there was a bunch of 'em
093 --> 'em, yes. An' I'll say there was a bunch of 'em,
093 --> ===^^^^^^^^^^^^^=^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
093 --> http://z-m-l.com/go/wmule/wmulep219.html

094 --> "You could go out and highjack some one."
094 --> "You could go out and highjack someone,"
094 --> ===================================^^^^^-
094 --> http://z-m-l.com/go/wmule/wmulep221.html

095 --> "If you can bring back a load of moonshine
095 --> "If you can bring back a load of moonshine,
095 --> ==========================================^
095 --> http://z-m-l.com/go/wmule/wmulep222.html

096 --> he's on the trail an' travelin, has yet t' be
096 --> he's on the trail an' travelin' has yet t' be
096 --> ==============================^--------------
096 --> http://z-m-l.com/go/wmule/wmulep225.html

097 --> "I'll take it in," said Nolan. "If any one
097 --> "I'll take it in," said Nolan. "If anyone
097 --> ======================================^---
097 --> http://z-m-l.com/go/wmule/wmulep225.html

098 --> be secret -- Mr. Nolan, you 'was talkin' t'
098 --> be secret -- Mr. Nolan, you was talkin' t'
098 --> ============================^--------------
098 --> http://z-m-l.com/go/wmule/wmulep227.html

099 --> put a time limit me, Mr. Nolan, an' nobody
099 --> put a time limit on me, Mr. Nolan, an' nobody
099 --> =================^^--------------------------
099 --> http://z-m-l.com/go/wmule/wmulep227.html

100 --> darn' good Ford yuh got! I was follered, and'
100 --> darn' good Ford yuh got! I was follered, and
100 --> ============================================^
100 --> http://z-m-l.com/go/wmule/wmulep228.html

101 --> I was follered hard. But I'm here an' they'
101 --> I was follered hard. But I'm here an' they
101 --> ==========================================^
101 --> http://z-m-l.com/go/wmule/wmulep228.html

102 --> hard on the knuckles--" He glanced down at
102 --> hard on the knuckles--" he glanced down at
102 --> ========================^-----------------
102 --> http://z-m-l.com/go/wmule/wmulep232.html

103 --> Casey's hands and grinned. "--I think it may
103 --> Casey's hands and grinned "--I think it may
103 --> =========================^------------------
103 --> http://z-m-l.com/go/wmule/wmulep232.html

104 --> kinda tough to break out with stealin I what yuh
104 --> kinda tough to break out with stealin' what yuh
104 --> =====================================^^---------
104 --> http://z-m-l.com/go/wmule/wmulep232.html

105 --> here while I'm gone. If any one shows up,
105 --> here while I'm gone. If anyone shows up,
105 --> ===========================^-------------
105 --> http://z-m-l.com/go/wmule/wmulep234.html

106 --> miracles. While he did not, literally change
106 --> miracles. While he did not literally change
106 --> ==========================^-----------------
106 --> http://z-m-l.com/go/wmule/wmulep236.html

107 --> of lying deliberately to the Little Woman,
107 --> of lying deliberately to the Little Woman, --
107 --> ==========================================^^^
107 --> http://z-m-l.com/go/wmule/wmulep238.html

108 --> Nolan noticed his silence, he gave no sign.
108 --> Nolan noticed his silence he gave no sign.
108 --> =========================^-----------------
108 --> http://z-m-l.com/go/wmule/wmulep238.html

109 --> On the day when his time limit expired
109 --> On the day when his time limit expired,
109 --> ======================================^
109 --> http://z-m-l.com/go/wmule/wmulep244.html

110 --> strained, between them for the rest of that day.
110 --> strained between them for the rest of that day.
110 --> ========^---------------------------------------
110 --> http://z-m-l.com/go/wmule/wmulep246.html

111 --> smoothly. By the time Casey spread his bed
111 --> smoothly. By the time Casey spread his bed --
111 --> ==========================================^^^
111 --> http://z-m-l.com/go/wmule/wmulep250.html

112 --> ing Lou gasped,
112 --> ing Lou gasped.
112 --> ==============^
112 --> http://z-m-l.com/go/wmule/wmulep254.html

113 --> Jim Cassidy came furtively over and settle
113 --> Jim Cassidy came furtively over and settled
113 --> ==========================================^
113 --> http://z-m-l.com/go/wmule/wmulep255.html

114 --> Mack Nolan and lick the livin' tar wit of him
114 --> Mack Nolan and lick the livin' tar out of him
114 --> ===================================^^--------
114 --> http://z-m-l.com/go/wmule/wmulep267.html

115 --> fault was mine if any one's. I was too busy
115 --> fault was mine if anyone's. I was too busy
115 --> =====================^---------------------
115 --> http://z-m-l.com/go/wmule/wmulep268.html

116 --> Cajon. The Little Woman peered into the rear~
116 --> Cajon. The Little Woman peered into the rear-
116 --> ============================================^
116 --> http://z-m-l.com/go/wmule/wmulep271.html

117 --> her grinning sheepishly.
117 --> her, grinning sheepishly.
117 --> ===^---------------------
117 --> http://z-m-l.com/go/wmule/wmulep275.html

118 --> "Naw. This ain't no trouble," he granted.
118 --> "Naw. This ain't no trouble," he grunted.
118 --> ===================================^-----
118 --> http://z-m-l.com/go/wmule/wmulep275.html
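
for anyone who wants to reproduce the marker row in the listing above, here is a minimal python sketch. it is only a guess inferred from the listing itself, not the actual tool that produced it. the assumption is that `=` marks the prefix the two lines share, `-` marks the suffix they share, and everything in between is compared position by position, with `^` for a mismatch and `=` for a chance match. the function name `marker_line` is made up for this illustration.

```python
def marker_line(original: str, corrected: str) -> str:
    """Build a marker row for two versions of one line (inferred rules, see above)."""
    longest = max(len(original), len(corrected))
    shortest = min(len(original), len(corrected))

    # count the characters both lines share at the start
    prefix = 0
    while prefix < shortest and original[prefix] == corrected[prefix]:
        prefix += 1

    # count the characters both lines share at the end, never letting
    # prefix + suffix exceed the length of the longer line
    suffix = 0
    while (suffix < shortest
           and prefix + suffix < longest
           and original[-1 - suffix] == corrected[-1 - suffix]):
        suffix += 1

    # compare the remaining middle span position by position;
    # a position past the end of the shorter line counts as a mismatch
    middle = []
    for i in range(prefix, longest - suffix):
        same = i < shortest and original[i] == corrected[i]
        middle.append("=" if same else "^")

    return "=" * prefix + "".join(middle) + "-" * suffix


if __name__ == "__main__":
    before = "the dugout, and know that he was permitted to"
    after = "the dugout, and knew that he was permitted to"
    print("028 --> " + before)
    print("028 --> " + after)
    print("028 --> " + marker_line(before, after))
```

since the marker row is as wide as the longer of the two lines, it only lines up under the text in a fixed-width font.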