From Bowerbird at aol.com Thu Jan 1 12:47:33 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 1 Jan 2009 15:47:33 EST
Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-01 -- happy new year
Message-ID:

this is the january 2009 series on "how to clean o.c.r.".
we're looking at a book called "the hound from the north".

***

remember my "how to clean up o.c.r." series last july?

that's right, it was 6 whole months ago -- time flies
when you're having fun -- so it's time for the sequel!

because, frankly, the d.p. people haven't learned yet.

***

welcome! happy new year!

we'll be cleaning the o.c.r. from "the hound from the north".

the book was split into 3 parts at distributed proofreaders.
> http://www.pgdp.net/c/project.php?id=projectID494113af10e06
> http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9
> http://www.pgdp.net/c/project.php?id=projectID494113f2b0f5f

the project manager for this book was roger frank...
rfrank also did the scanning, and the preprocessing,
and will be doing the post-processing too, as usual.
he's a regular digitization machine, he is!

we'll examine the second part first. (i'm not sure why.)

***

due to many shortcomings in the clean-up of this o.c.r.,
i'll have to spend much time doing constructive criticism.
so let's start off by noting something that was done right.

the scans for this book -- and this is typical of rfrank --
are very nicely done. they look good. not only were they
carefully scanned, but they have been rotated to straight,
and cropped nicely. so they give a very nice appearance...

this is not a trivial matter. even though the scans will be
largely unused once we've digitized the text from them,
the digitization process benefits greatly from clean scans.
it's extremely hard on the eye and the mind to work with
bad scans, and that difficulty can cause quick burn-out.
nice scans, on the other hand, are a real joy to work with.
it's well worth taking the time to make them nice.

***

within a d.p.
o.c.r. file, you will see separators like this: > -----File: 118.png------------------------------- > -----File: 119.png------------------------------- > -----File: 120.png------------------------------- these lines separate the o.c.r. output from each scan, and tell you the _name_ of the scan that produced that output. unfortunately, rfrank (like many other "content providers" over at distributed proofreaders) doesn't name these files wisely, so the _page-number_ of each of these files is often _not_ reflected in the filename. (rather, the filename shows the _sequential_ number of the files as they were scanned.) this makes it unnecessarily difficult to deal with the pages in the process of cleaning up the text, because we have to negotiate two different, incompatible numbering schemes. the solution is to rename the files with their page-number. in this book, for part 2, the page-number was an offset of -6, so rfrank's files 118-234 were renamed to 112-228... of course, then you also need to change the references in the separators, so they will sync up with the new filenames. of course, if you rename the files _before_ you do the o.c.r., the number/names shown in the o.c.r. file will be accurate, so you won't need to change references in the separators... but also, of course, if you _scan_ the pages such that their sequence-number _matches_ the page-number -- which essentially means you restart your sequence at page 1 -- you don't have to do any renaming. that's the smartest... *** in addition, that d.p. separator -- with all of its dashes -- is a particularly unwise choice, since many of the routines that we'll be developing are designed to search for dashes -- particularly those falling at the start or end of a line -- and these dashed separators are just a stupid distraction... so i change the separators to z.m.l. page-information style. 
that means that the 3 lines shown above are converted to: > {{houndp112.png}} || hound of the north || > {{houndp113.png}} || hound of the north || > {{houndp114.png}} || hound of the north || first off, i use the {{}} bracketing to enclose the scan-name, having applied the offset of -6. i also prepend the "bname" -- "bookname" -- my 5-character label that defines the book. for this book, the bname is "hound". the "p" after the bname indicates that this is a page within the body-text of the book. and "hound of the north" surrounded by || is the run-head. once i've applied z.m.l.-style headers, i can create the .html that allows me to upload the book, so _you_ can see it too... once you know the "bname" for one of my books, you know _where_ to find it online, using the following skeleton guide: > http://z-m-l.com/go/bname so for this book -- with its bname of "hound" -- you'd find it at: > http://z-m-l.com/go/hound each file in that "hound" subdirectory also starts with "hound". so, for instance, the image for _page_ 123 is here: > http://z-m-l.com/go/hound/houndp123.png an .html page that shows the text and the image for page 123 is: > http://z-m-l.com/go/hound/houndp123.html if you go to that u.r.l., and thumb through the pages, you'll see how nice rfrank's scans look. it's a pleasure to work with them. so that's today's lesson: 2009-01-01. rename files and change separators, if necessary. see you tomorrow! happy new year! -bowerbird ************** New year...new news. Be the first to know what is making headlines. (http://www.aol.com/?ncid=emlcntaolcom00000026) -------------- next part -------------- An HTML attachment was scrubbed... 
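A minimal Python sketch of the 2009-01-01 lesson above (apply the page-number offset, then rewrite each d.p. separator as a z.m.l.-style header). It assumes the separators look exactly like the samples quoted in that post. The function name `convert_separators`, the regular expression, and the 3-digit zero-padding are illustrative assumptions, not bowerbird's actual tool.

```python
import re

# d.p. separator, e.g. "-----File: 118.png-------------------------------"
SEPARATOR = re.compile(r"^-+File: (\d+)\.png-+\s*$")

def convert_separators(text, bname, runhead, offset):
    """Rewrite each d.p. separator line as a z.m.l.-style page header.

    bname   -- short book label, e.g. "hound"
    runhead -- running head placed between the || markers
    offset  -- page-number minus scan sequence-number (-6 for part 2)
    """
    out = []
    for line in text.splitlines():
        m = SEPARATOR.match(line)
        if m:
            page = int(m.group(1)) + offset
            # "p" marks a body-text page; zero-padding to 3 digits is an assumption
            line = "{{%sp%03d.png}} || %s ||" % (bname, page, runhead)
        out.append(line)
    return "\n".join(out)

sample = (
    "-----File: 118.png-------------------------------\n"
    "some o.c.r. text\n"
    "-----File: 119.png-------------------------------"
)
print(convert_separators(sample, "hound", "hound of the north", -6))
```

The scan files themselves would need the same offset applied to their names (for instance with `os.rename`), so that the headers and the filenames stay in sync.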
URL: From ag737 at ncf.ca Fri Jan 2 11:25:11 2009 From: ag737 at ncf.ca (Wallace J.McLean) Date: Fri, 02 Jan 2009 14:25:11 -0500 Subject: [gutvol-d] Public Domain Day 2009 Message-ID: <140ea413cf7f.13cf7f140ea4@ncf.ca> Happy Public Domain Day!* http://www.xanga.com/publicdomain/687985634/public-domain-day-2009.html * A little bug prevented me from posting this to the list yesterday. From Bowerbird at aol.com Fri Jan 2 13:34:36 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 2 Jan 2009 16:34:36 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-02 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-02. fix the paragraphing, including the terminators... many of the routines we'll use are based on _paragraphs_, so we have to ensure that glitches in the o.c.r. paragraphing are fixed... also, in regard to distributed proofreaders, if the paragraphing is correct before you send the file to the p1 proofers, then it's easier to do a comparison of the output across the rounds, because you can use a line-based comparison methodology without the need to worry about problems with line-sync between the various files. *** the first general type of paragraphing error that o.c.r. makes is the introduction of a blank line into the middle of a paragraph... the easiest check for this is to see whether a line which does _not_ end with a paragraph-terminator is followed by a blank line, which is -- in turn -- followed by a line that starts with a lowercase letter. we have a bunch of these cases -- 26 -- in this file: > i > all she could say. > he fingered his watch from force of habit. > followed an excited murmur. "What's Peter going > too late. They had seen the blood upon his hand. > face was blanched, but her mouth was tightly clenched. 
> prostrate man had vanished; a world of pity was in > her eyes as she silently looked on. > violent fit of coughing seized the dying man, then it > t > hand, "another mile of this d------d valley and I > should have turned tail and fled back to the open. > amount that Hervey promptly decided to double the > muttered. "?I suppose he thinks | am blind. Well, > before------" She broke off, only to resume again with > a fierce and passionate earnestness of which Alice > may love me now; | believe | love him, but------No, > snarl at, you d-----d cur," Hervey said, between his > clenched teeth. Then he turned at the sound of his > about him curiously. > o > for further demonstration. The prolonged screech of > p > gazed appealingly into his face. And the man had > and then------No, mother and | will see the matter > through. We have already secured the services of in general, you can delete these blank lines "blindly". *** another type of paragraphing error is the _absence_ of a capital letter at the start of a line on a new paragraph. so, you search for a line that's paragraph-terminated which is followed by a blank line that is -- in turn -- followed by a line that begins with a lowercase letter. there were no cases of this bug in the file. *** a common problem with paragraph-termination involves exclamation-points being misrecognized as the number 1. a search for the number 1 brings up 7 cases: > "Ther1 ain't no one aboard of that sleigh," he called > her brain in an unchecked torrent It seen ^1 to > women that ?'$1 never let the child rest a minute." > afternoon's 'chicken shoot.1 He says the prairie-chicken > I'll join you directly. Where are you? In the wash-1 > in, both of you, or the ?' slap-jacks' ?'$1 all be spoiled." > Dead'ning a mind to lofty thought for which by nature meant.1* none of these cases are misrecognized exclamation-points, but all of 'em are errors which can (and should) be corrected. *** another common paragraphing problem in o.c.r. 
output is the "phantom" line, where a mark on a page is misrecognized as a one-letter or two-letter "word". a search for any line that consists of a one-letter-word or a two-letter-word finds these. there were 8 cases of these: > M > i > t > o > p > IT > ,' > ?" you'll notice that some of these lines were discovered earlier, when we searched for lowercase lines preceded by blank lines. there is redundancy in these checks, and that's a good thing... the last 2 cases listed above are ones where some punctuation from the end of the previous line was incorrectly moved down to the start of the next line; it's easy enough to move it back up. but it does show that it's a good idea to monitor these changes, rather than to have them applied blindly. (it wouldn't be the end of the world if these lines had been eliminated in a blind change; if 6 errors are fixed and 2 are not, that's not a bad performance, but since it's easy to monitor these changes, why not get 8 of 8?) *** one final routine, not entirely focused on paragraphing per se, but similar to the search above, and relevant in some situations: search, at the beginning of a line, for a one-character "word" which is not "a" (in uppercase or lowercase) or "i" (in uppercase). (i'm not fluent enough in reg-ex to say how to specify that search, but perhaps someone who is will share that information with us.) there were 8 cases of this particular anomaly in this file: > ? I'll tell you; you have brought her nothing > * By Jove I you are a good sort, George. If you > * How much will appease your creditors?" > u Sakes alive, girl, yes. It's the way you have said. > ? When shall you return?" > ' through mail' to the coast and have to make up > ' A life monotonous, unrelieved, breeds selfish discontent, > ' I'll close the window." 
Iredale moved across the

most of these are cases where a double-quote at the beginning
of the line was misrecognized -- either as an asterisk, or a "u",
or as a backtick, or as a single-quote -- so it's good to find these
glitches and correct them, as they do indeed bear on paragraphing.

***

there's one more issue concerning paragraphing, but we've already
covered more than enough today. i'll do that final issue tomorrow...

***

here's the summary of our lessons:

2009-01-01. rename files and change separators, if necessary.
2009-01-02. fix the paragraphing, including the terminators...

see you tomorrow!

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gbnewby at pglaf.org Fri Jan 2 15:13:16 2009
From: gbnewby at pglaf.org (Greg Newby)
Date: Fri, 2 Jan 2009 15:13:16 -0800
Subject: [gutvol-d] Blog post of "thanks"
Message-ID: <20090102231316.GD19567@mail.pglaf.org>

I thought I'd share (and I already told the poster
not to use promo.net/pg):

http://blog.offbeattravel.com/?p=240

It's basically a "thank you" to PG.
 -- Greg

From 1001 at atlanticbb.net Sat Jan 3 14:39:03 2009
From: 1001 at atlanticbb.net (1001 at atlanticbb.net)
Date: Sat, 3 Jan 2009 17:39:03 -0500
Subject: [gutvol-d] Google books
Message-ID: <009c01c96df4$182203e0$680fa8c0@atlanticbb.net>

Anyone know how Google Books works? There are lots of books which just
come up with titles and maybe a publisher. It doesn't look like they have
been scanned or inventoried or anything like that. Are these books that
may appear one day? Or are they just books they know about and are going
to ignore? And why do they have "no access" to so many books over
100 years old?

nwolcott2 at post.harvard.edu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From Bowerbird at aol.com Sat Jan 3 19:17:38 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sat, 3 Jan 2009 22:17:38 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-03 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-03. fix the paragraphing around the pagebreaks too. *** as it was yesterday, today's topic of concern is _paragraphing_... in particular, today's lesson is paragraphing at the top of a page. you might remember that when i did this series back in july 2008, this particular matter became quite contentious. that's why i have devoted an entire day to making this specific point -- once again. back then, i said roger frank (rfrank) had created new paragraphs whenever the first line on a page started with an uppercase letter. i said that -- although a good notion -- this isn't sophisticated enough, because it doesn't account for the mid-paragraph lines which start with a person's name. i noted you get better results if you bring into the equation the last line on the previous page. if that last line of the previous page was sentence-terminated -- i.e., ended with a period, exclamation-point, or question-mark, or any of those punctuation-marks followed by a double-quote -- _and_ the first line of the current page started with uppercase, then you can assume a new paragraph with much more certainty. (although, as we'll determine later, even this gives no guarantee; but lacking previous-page sentence-termination, no paragraph.) at that time, rfrank entered the thread, in a very unseemly manner. he insulted me, told me how he didn't value my opinion, on and on. (as if the long comprehensive lists of errors that i had been posting for months prior, and months since, are just a matter of "opinion".) 
more to the point, he challenged the veracity of what i had said... he told me that his program did indeed consider the last line of the previous page in making the decision about a new paragraph, and he curtly told me i could look at the code if i didn't believe him. being that i'm more concerned with the _output_ of that program, i went back to double-check the text-file to see if i'd been wrong... nope. my second analysis confirmed my first. it was unequivocal. _none_ of the data supported the interpretation rfrank had given... _every_single_case_ was _exactly_ how i'd characterized it earlier... every page starting with a capital was marked as a new paragraph, and no page that didn't start with a capital was. totally unequivocal. so i posted the data, publicly, so that everyone could see i was right, and that rfrank was the one guilty of making the "misstatement" that he had tried (without evidence) to pin on me. the facts are the facts... anyway, i woulda thought that that embarrassing escapade would've taught rfrank that he needed to get a grip, and rewrite his code too... *** but no... *** because, some 6 months later, i sadly must report an identical flaw... _this_ text has the exact same shortcomings that the earlier one had: a new paragraph is made if the first line on a page is capitalized, even when the last line on the previous page was not sentence-terminated. in other words, rfrank is still using the inferior routine he used before. the evidence, as before, is totally unequivocal... *** specifically, this has resulted in paragraphing errors on these 6 pages: > her brain in an unchecked torrent It seen ^1 to > -----File: 134.png------------------------------------------ > Prudence as though some barrier had suddenly shut > http://z-m-l.com/go/hound/houndp128.html > recognized the fact that Iredale was in love with > -----File: 152.png------------------------------------------ > Prudence, nor was he slow to appreciate the possl? 
> http://z-m-l.com/go/hound/houndp146.html > own interests, and he revelled in the thought of > -----File: 205.png------------------------------------------ > George Iredale's wealth. The despicable methods > http://z-m-l.com/go/hound/houndp199.html > things. Either he must renounce all thoughts of > -----File: 211.png------------------------------------------ > Prudence Mailing, or he must marry her, and break from > http://z-m-l.com/go/hound/houndp205.html > Iredale moved over to where Prudence was sitting > -----File: 220.png------------------------------------------ > She had ceased work to greet him, but she did not > http://z-m-l.com/go/hound/houndp214.html in the first 5 cases above, it's clear that the _paragraph_ just continued, as the _sentence_itself_ continues from the previous page to the current. in the sixth case, it's unclear, until we check the scans and observe that "was sitting" should likely be followed by a sentence-terminating period -- even though that period appears to be missing from the scan -- but that "she had ceased" continues the paragraph, doesn't start a new one, which is clear because its line is not indented. *** that sixth case from above leads us right to the next part of the lesson. even when you check for sentence-termination on the previous page, there _are_ cases where a paragraph continues in spite of the fact that the last line on the previous page was sentence-terminated and (thus) the first line on the current page is capitalized. however, you can test for it by considering the _length_ of the last line on the previous page; if that last line is long, there's a chance you don't have a new paragraph. that means you have to look at the scans where the last line is fairly long. specifically, you must look at the scan to see if there is a new paragraph; in order to determine it, of course, you see if that first line is _indented_. 
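The pagebreak rules described above can be sketched roughly as follows. This is a hedged illustration, not rfrank's code or bowerbird's: the function name `classify_pagebreak`, the three return labels, and the 45-character threshold for a "fairly long" last line are all assumptions chosen for the example.

```python
import re

# period, exclamation-point, or question-mark, optionally followed
# by a double-quote, at the very end of the line
TERMINATED = re.compile(r'[.!?]"?\s*$')

def classify_pagebreak(last_line, first_line, long_line=45):
    """Guess whether the first line of a page starts a new paragraph.

    last_line  -- last text line of the previous page
    first_line -- first text line of the current page
    long_line  -- assumed length at which the last line counts as "long"

    Returns "continue", "new", or "check-scan".
    """
    if not TERMINATED.search(last_line):
        # the sentence runs across the pagebreak, so no new paragraph
        return "continue"
    if not first_line.lstrip('"\' ')[:1].isupper():
        # terminated, but the next page starts lowercase: something is off
        return "check-scan"
    if len(last_line.rstrip()) >= long_line:
        # a long last line may simply have filled the measure;
        # only the indentation on the scan can settle it
        return "check-scan"
    return "new"

print(classify_pagebreak(
    "her brain in an unchecked torrent It seen ^1 to",
    "Prudence as though some barrier had suddenly shut"))
print(classify_pagebreak(
    "he had plainly shown what manner of man he was.",
    "Even the doting affection of his mother had not"))
```

The "check-scan" result does not decide anything by itself; it only shortlists the pages where a person has to look at the image and see whether the first line is indented.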
doing so would have prevented paragraphing mistakes on these 5 pages:

> he had plainly shown what manner of man he was.
> -----File: 151.png------------------------------------------
> Even the doting affection of his mother had not
> http://z-m-l.com/go/hound/houndp145.html

> IT
> -----File: 184.png------------------------------------------
> "A light," sfee said. "?That must be the ranch. Quick,
> http://z-m-l.com/go/hound/houndp178.html

> snuffed at the projecting arm of a wooden cross.
> -----File: 199.png------------------------------------------
> Then it drew back sharply with its little upstanding
> http://z-m-l.com/go/hound/houndp193.html

> and never had he felt so free from care as now.
> -----File: 215.png------------------------------------------
> He realized all that a lover may realize of his own
> http://z-m-l.com/go/hound/houndp209.html

> long silence while the machine rattled down a seam.
> -----File: 221.png------------------------------------------
> The man watched the nimble fingers intently as they
> http://z-m-l.com/go/hound/houndp215.html

checking any of those pages will show you the first line is _not_ indented;
therefore, the new paragraph that rfrank had inserted is _incorrect_ there.

***

finally, there's one case where rfrank _failed_ to find a new paragraph on
a pagebreak, because a double-quote was misrecognized as an asterisk.

> hand down heavily upon the table.
> -----File: 159.png------------------------------------------
> * By Jove I you are a good sort, George. If you
> http://z-m-l.com/go/hound/houndp153.html

you will recall that we specifically looked for misrecognized double-quotes
at the start of a line, yesterday, and this is precisely the reason we did that...
(indeed, that's the reason i found that problem.)

***

here's the summary of our lessons:

2009-01-01. rename files and change separators, if necessary.
2009-01-02. fix the paragraphing, including the terminators...
2009-01-03.
fix the paragraphing around the pagebreaks too.

see you tomorrow!

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From debook2164 at hotmail.com Sat Jan 3 19:21:45 2009
From: debook2164 at hotmail.com (David Edwards)
Date: Sat, 3 Jan 2009 21:21:45 -0600
Subject: [gutvol-d] Google books
In-Reply-To: <009c01c96df4$182203e0$680fa8c0@atlanticbb.net>
References: <009c01c96df4$182203e0$680fa8c0@atlanticbb.net>
Message-ID:

Some of the books you see listed that way are still under copyright,
and the publishers stopped Google from posting them. An agreement over
that issue was reached this past fall. Most of the rest that have limited
or no view are books that someone has claimed some type of copyright to.
Often illegally, but Google honors anyone who is willing to pay them money
to not allow full access to a Public Domain book.

Google's purpose was never to benefit the public with full-view scanned
books. The idea was to sell ads. You can see a few choice pages that match
your search, then you are expected to see the ads for those that have or
may have the book to sell. The only ones that have full access are the
universities and libraries that have allowed them to scan their books,
and they in turn have been selling the rights to see the books. That too
has been addressed in the settlement: the schools and libraries must now
offer a "free" station to view them. A joke really, as many have set up
only one or at most two public computers that can access the books
for free.

It is the person with the money that talks when Google is listening.

From: 1001 at atlanticbb.net
To: gutvol-d at lists.pglaf.org
Date: Sat, 3 Jan 2009 17:39:03 -0500
Subject: [gutvol-d] Google books

Any one know how google books works?
There are lots of books which just comeup with titles and maybe a publisher. Don't look like they have been scanned or inventoried or anyghking like that. Arethese books that may appear one day? or are they just booksthey kno about and are going to ignore? and why do they have "no access" to so many books over 100 years old. nwolcott2 at post.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Sun Jan 4 15:37:39 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sun, 4 Jan 2009 18:37:39 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-04 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-04. compile a list of the proper nouns in the book... *** you'll remember yesterday that i said there are _redundancies_ in the checks that we'll do. there are also _interdependencies_. the check we'll be doing tomorrow is for sentence-terminators. that's a fairly straightforward check, with two separate aspects... the check is for a non-sentence-terminated word _followed_by_ an uppercase word. this kind of glitch happens for two reasons. first, it can happen when o.c.r. misses the sentence-termination. a period, in particular, is small, and is often missed by the o.c.r. second, the glitch can also happen when a lowercase letter is misrecognized as an uppercase letter. this happens less often, but still can be surprisingly frequent, especially with the letter "i". there is a big wrinkle with this check, however, and that is with _proper_nouns_ -- like _names_ -- which are _also_ capitalized. since they often occur mid-sentence, they mess up this check... in order to control this wrinkle, we must compile a list of names used in the book, so we can ignore them when doing our check. 
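As a rough illustration of the check just described, here is a small Python sketch that lists capitalized words which do _not_ follow sentence-termination. The names `MID_SENTENCE_CAP` and `mid_sentence_capitals` are hypothetical, and the regular expression is a deliberately crude assumption: it treats any period as a sentence-terminator, so a name that follows "Mr." or "Mrs." is missed, and it ignores the pronoun "I".

```python
import re
from collections import Counter

# a character that is neither whitespace, a sentence-terminator, nor a
# quote-mark; then whitespace (line breaks included); then a capitalized word
MID_SENTENCE_CAP = re.compile(r"""(?<=[^\s.!?"'])\s+([A-Z][A-Za-z'-]*)""")

def mid_sentence_capitals(text, known_names=()):
    """Count capitalized words that appear mid-sentence.

    known_names -- names already accepted for this book; they are skipped,
                   so whatever remains is a candidate glitch.
    """
    known = set(known_names)
    counts = Counter()
    for m in MID_SENTENCE_CAP.finditer(text):
        word = m.group(1)
        if word == "I" or word.startswith("I'"):
            continue  # the pronoun is always capitalized
        if word in known:
            continue
        counts[word] += 1
    return counts

text = ("her brain in an unchecked torrent It seemed to "
        "Prudence as though. Then she left.")
print(mid_sentence_capitals(text))
print(mid_sentence_capitals(text, known_names={"Prudence"}))
```

Run with no `known_names`, the output mixes names and glitches together; run again with the accepted names passed in, it leaves only the suspicious cases, such as a missed period or a lowercase letter misread as uppercase.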
ironically, one of the best ways to compile the list of names is
to use this exact same check; that is, we compile all the words
that are capitalized, _except_ when they follow sentence-termination.

***

in other words, we run this routine, and sort output to two piles:
one pile is the list of names; the other is glitches needing fixes...

***

once i have this list of words, i throw out all of 'em that _are_
included in my spell-check dictionary (from which i intentionally
excluded proper-names), and that leaves me a pretty good list...

i _will_ include a capitalized word, even if it _is_ in my dictionary,
if it appears _only_ in capitalized form in the book, not lowercase.

for instance, in this book, there was a character named "prudence".
because that word didn't appear in a lowercase form, i included it.
and there was another with the last name of "grey", also included.

***

when i did this series last july, i included specific criteria as to
how to specify each routine. i'm not doing that for this sequel,
because i've decided i'll release an app that does these checks.

this means you can just copy the text of a book to my program,
and it will spit out all the names for you, simple as apple pie...

but by all means, if someone can generate the reg-ex code to
generate the list of names used in a book, feel free to share it.

anyway, the list of names in this book is appended. you'll see
the list includes the full name of a character, if that's available,
and splits out first and last names separately, when available...

the top of the appended list shows the full character names,
while the bottom shows first and last names separately.

you'll also notice that this list can highlight some _problems_.

take a look through the full names and see if you spot them...

did you?

we have both "robb chillingwood" and "robe chillingwood",
where "robe" is an obvious misrecognition. and we also see
"peter furrer" and "peter furrers", inconsistent in the p-book.
and there is a "miss covills" versus a "mrs. covill", but a check in the p-book shows the former references multiple misses... there are some other curious constructions as well, including: > And Alice -- (1) * > G-- The -- (1) > Get Prudence -- (1) > Grey.' Those -- (1) > Iredale, She -- (1) * > No, Alice -- (1) > Owl Hoot He -- (1) * > Owl Hoot How -- (1) * > Then Leslie -- (1) the ones with asterisks were glitches that needed to be fixed... finally, a more extensive routine turned up this anomaly: > Owl Hoot Valley -- (1) * > Owl Hoot valley -- (1) since the valley is mentioned many times, but only capitalized the one time, i considered this an error, and decapitalized it... *** the bottom part of the list of names appended to this message gives the mid-sentence capitalized words. this list also contains some curious constructions which give information about errors: > Iredalc* > Iredale > Raach* > Ranch the items with asterisks were errors that were corrected... there were also some questionable items, which were checked: > Customs > Fate > Nature all 3 of these items were found to be correct. another two cases of curious construction were checked: > Now she asked herself, What had she done? > http://z-m-l.com/go/hound/houndp171.html ... > and then -- No, mother and I will see the matter > http://z-m-l.com/go/hound/houndp222.html although one could certainly question these constructions, i decided to leave them as they were. *** last but not least, we have the case of "mailing versus malling". there is a family in this book who has the last name of "malling", but the o.c.r. recognized all 28 cases of their name as "mailing". the p1 proofers only discovered 10 of these misrecognitions... actually, it's probably more accurate in this case to think about the individual p1 proofers who worked on the book, since once a proofer realized the o.c.r. was making this particular mistake, that proofer would find it and fix it every single time it occurred. 
a proofer who never realized it would gloss over every instance. so here once again one of the biggest flaws in the d.p. workflow reveals itself. this is a mistake that could have been fixed fully, and immediately, with a global-change once it had been found. there's no good reason in the world to make the proofers find and fix each one of these instances individually and manually... and -- as we've seen here -- doing so just eats up resources... this text had to go through 2 rounds of proofing to get all of this one particular error cleaned up, when the error could've been fixed totally and completely with a single global change. moreover, sometimes the proofers fail to find all the instances. recall that in a book i checked recently, there were instances of an error in a name -- which i'd global-changed at the outset -- which persisted all the way through to p.g. _posting_ the book! they'd slipped through several rounds of proofing, formatting, plus the post-processing stage, and even their white-washing! when they _should've_ been globally fixed before going to p1... *** before we close up today's lesson, i should mention one thing that i haven't brought up so far, namely that this list of names will also be of great use to us later on, when we do spellcheck. spellcheck is something that you want to do _continuously_ on an e-text as you work with it, so that if you accidentally introduce an error, there is a chance that it will be detected. but the only way to avoid a ton of false-alarms in spellcheck is to have a good book-specific dictionary listing exceptions. this list serves that purpose, so it's tremendously useful to us. *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... see you tomorrow! 
-bowerbird a list of 2-or-more consecutive-capitalized-words mid-sentence: > Alice Gordon -- (4) > And Alice -- (1) > Aunt Sarah -- (1) > Chintz, Hervey -- (1) > Dominion Ranch -- (1) > Dr. Parash -- (1) > Free Press -- (1) > Front Hill -- (2) > G-- The -- (1) > George Iredale -- (14) > German, Grieg -- (1) > Get Prudence -- (1) > Gordon Duffield -- (1) > Grey.' Those -- (1) > Harry Gleichen, Mr. Danvers -- (1) > Haunted Hill -- (4) > Hephzibah Malling -- (6) > Iredale, Hephzibah -- (1) > Iredale, She -- (1) > Leslie Grey -- (14) > Lonely Ranch -- (13) > Loon Dyke -- (4) > Loon Dyke Farm -- (8) > Miss Covills -- (1) > Mr. Chillingwood -- (1) > Mr. Danvers -- (4) > Mr. George Iredale -- (1) > Mr. Grey -- (1) > Mr. Iredale -- (9) > Mr. Malling -- (1) > Mr. Robb Chillingwood -- (1) > Mrs. Covill -- (1) > Mrs. Ganthorn -- (2) > Mrs. George Iredale -- (1) > Mrs. Gurridge -- (1) > Mrs. Malling -- (17) > No, Alice -- (1) > Northern Union Hotel -- (1) > Owl Hoot -- (15) > Owl Hoot He -- (1) > Owl Hoot How -- (1) > Owl Hoot Valley -- (1) > Peter Furrer -- (1) > Peter Furrers -- (2) > Prudence, Iredale -- (1) > Rev. 
Charles Danvers -- (1) > Robb Chillingwood -- (4) > Robe Chillingwood -- (1) > Rocky Mountains -- (1) > Sarah Gurridge -- (6) > Then Leslie -- (1) > Winnipeg Free Press -- (2) > Yukon Valley -- (1) a list of mid-sentence capitalized-words: > Ainsley > Al > Alaskan > Alice > American > Andy > August > Aunt > Badlands > Canada > Canadian > Canuk > Chillingwood > Chinese > Chintz > Covill > Covills > Customs > Dakota > Danvers > Deane > Dominion > Duffield > Dyke > East > Emma > Fate > Furrer > Ganthorn > George > German > Gleichen > God > Gordon > Government > Grey > Grieg > Gurridge > Harry > Hephzibah > Hephzy > Hervey > Hill > Hoot > Hotel > I.O.U.s > Indian > Indians > Iredalc > Iredale > Jove > Lake > Lakeville > Leonville > Leslie > Lord > Malling > Manitoba > Methodist > Minister > Minnesota > Miss > Mother > Mr > Mrs > Nature > Neche > Niagara > Owl > Parash > P-Press > Peter > Prudence > Prue > Raach > Ranch > Rev > Robb > Rockies > Sarah > Scot > Scotia > Scotland > Shire > Silas > Spartan-like > States > Stetson > Tim > Timothy > Toronto > West > Winnipeg > Woods ************** New year...new news. Be the first to know what is making headlines. (http://www.aol.com/?ncid=emlcntaolcom00000026) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hart at pglaf.org Mon Jan 5 08:50:02 2009 From: hart at pglaf.org (Michael Hart) Date: Mon, 5 Jan 2009 08:50:02 -0800 (PST) Subject: [gutvol-d] Year Ends in 49 Hours Message-ID: Just in case anyone wants to get any file into the catalog as having been produced in our 2008 production year, which years run from noon on the first Wednesday of one year to noon of the first Wednesday of the following year. . .so all counting will take place Wednesday, at noon, for the final tally. Thanks!!!!!!! Michael From hart at pglaf.org Mon Jan 5 09:09:35 2009 From: hart at pglaf.org (Michael Hart) Date: Mon, 5 Jan 2009 09:09:35 -0800 (PST) Subject: [gutvol-d] !@! 
Creating eText of The Inaugural Address
Message-ID:

As you may know, Project Gutenberg provides a rather instant online edition
of The Inaugural Address every four years. Volunteers will
be needed to record the address and type it in as best they can.

We will then compare these various entries to create a 1st edition,
hopefully online within an hour of the address.

Then we usually create a 2nd and 3rd edition correcting any errors,
resolving punctuation issues [hard to know those with the speeches]
and generally polishing up the whole thing over the next two hours.

If any of you would care to volunteer, please let me know, as this
will all take place 15 days from right now.

Thanks!!!

Michael

From jared.buck at gmail.com Mon Jan 5 09:11:26 2009
From: jared.buck at gmail.com (Jared Buck)
Date: Mon, 5 Jan 2009 09:11:26 -0800
Subject: [gutvol-d] !@! Creating eText of The Inaugural Address
In-Reply-To:
References:
Message-ID:

I'd be happy to volunteer to do that. I'm a registered Democrat, I voted overwhelmingly for Obama, and I am looking forward to the Inauguration. I plan on watching the address and then checking news sites for the text of the speech.

jared

On Mon, Jan 5, 2009 at 9:09 AM, Michael Hart wrote:
>
> As you may know, Project Gutenberg provides a rather instant online edition
> of The Inaugural Address every four years. Volunteers will
> be needed to record the address and type it in as best they can.
>
> We will then compare these various entries to create a 1st edition,
> hopefully online within an hour of the address.
>
> Then we usually create a 2nd and 3rd edition correcting any errors,
> resolving punctuation issues [hard to know those with the speeches]
> and generally polishing up the whole thing over the next two hours.
>
> If any of you would care to volunteer, please let me know, as this
> will all take place 15 days from right now.
>
>
> Thanks!!!
> > > Michael > > > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Mon Jan 5 11:24:51 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Mon, 5 Jan 2009 14:24:51 EST Subject: [gutvol-d] !@! Creating eText of The Inaugural Address Message-ID: michael said: > instant online edition of The Inaugural Address sounds very collaborative, and fun. i'm wondering if it'll be needed this year, though. obama has already been releasing a lot of stuff -- "early and often", as the saying goes -- and i'd think the inauguration speech would be ripe. it almost goes without saying it'll be on youtube. and with such a gifted speaker, it's much better to _hear_ the delivery than just read the speech. it'll still be great to have the text, of course, but in terms of a special effort, it might be the case that the times finally caught up to you, michael. jon noring is still trying to steal your thunder, but the rest of us acknowledge you were too far ahead of your times, the curse of the prescient... as practical note, i'd say the wiki is your best tool. -bowerbird ************** New year...new news. Be the first to know what is making headlines. (http://www.aol.com/?ncid=emlcntaolcom00000026) -------------- next part -------------- An HTML attachment was scrubbed... URL: From creeva at gmail.com Mon Jan 5 11:34:25 2009 From: creeva at gmail.com (Brent Gueth) Date: Mon, 5 Jan 2009 14:34:25 -0500 Subject: [gutvol-d] !@! Creating eText of The Inaugural Address In-Reply-To: References: Message-ID: <2510ddab0901051134r1a1245bco6bc182125ebf0df3@mail.gmail.com> I disagree - I trust the longevity of PG over youtube - It should be saved in PG. Even if people would rather watch it on youtube, there is no reason to break tradition for something that only takes an hour. 
On Mon, Jan 5, 2009 at 2:24 PM, wrote: > michael said: > > instant online edition of The Inaugural Address > > > sounds very collaborative, and fun. > > i'm wondering if it'll be needed this year, though. > > obama has already been releasing a lot of stuff > -- "early and often", as the saying goes -- and > i'd think the inauguration speech would be ripe. > > it almost goes without saying it'll be on youtube. > and with such a gifted speaker, it's much better > to _hear_ the delivery than just read the speech. > > it'll still be great to have the text, of course, but > in terms of a special effort, it might be the case > that the times finally caught up to you, michael. > > jon noring is still trying to steal your thunder, > but the rest of us acknowledge you were too far > ahead of your times, the curse of the prescient... > > as practical note, i'd say the wiki is your best tool. > > -bowerbird > > > > ************** > New year...new news. Be the first to know what is making headlines. ( > http://www.aol.com/?ncid=emlcntaolcom00000026) > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Mon Jan 5 12:46:45 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Mon, 5 Jan 2009 15:46:45 EST Subject: [gutvol-d] !@! Creating eText of The Inaugural Address Message-ID: brent said: > I disagree who is it that you are disagreeing with? not me. i said, quite specifically: > it'll still be great to have the text the question is whether it'll need to be done _immediately_, or whether we could just wait until obama releases the text version himself. but hey, if people wanna do it right away, fine! so tell us, brent, were you volunteering to help? -bowerbird ************** New year...new news. Be the first to know what is making headlines. 
From creeva at gmail.com Mon Jan 5 12:58:43 2009
From: creeva at gmail.com (Brent Gueth)
Date: Mon, 5 Jan 2009 15:58:43 -0500
Subject: [gutvol-d] !@! Creating eText of The Inaugural Address
In-Reply-To:
References:
Message-ID: <2510ddab0901051258s1334a96fu350ea5cddf481cbf@mail.gmail.com>

I don't plan on watching the address - if I were, I would definitely help with the transcription. You say we should wait for the official transcription. I'm sure in past years PG's wasn't necessarily the first or the authoritative copy online - if anything I think I would look at this as a community event more than a milestone.

I normally keep quiet on the message threads (and agree with Bowerbird occasionally). This year I'm going to be a bit more active in PG - but more in relation to getting some new (old) books scanned and trying to buff up the PG archive of sheet music. If there is a need because of lack of volunteers, and my wife is not yet in labor - I can definitely help out, but I'm fairly sure I won't be available on the 20th to actually partake.

I was only disagreeing with you in that I think PG needs community events like this. Technology may make things quicker, and some things unnecessary - but if it's something that is traditional, keep it for as long as possible.

On Mon, Jan 5, 2009 at 3:46 PM, wrote:
> brent said:
> > I disagree
>
> who is it that you are disagreeing with?
>
> not me.
>
> i said, quite specifically:
> > it'll still be great to have the text
>
> the question is whether it'll need to be done
> _immediately_, or whether we could just wait
> until obama releases the text version himself.
>
> but hey, if people wanna do it right away, fine!
>
> so tell us, brent, were you volunteering to help?
>
> -bowerbird
>
>
>
> **************
> New year...new news. Be the first to know what is making headlines.
( > http://www.aol.com/?ncid=emlcntaolcom00000026) > > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Mon Jan 5 23:04:55 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 6 Jan 2009 02:04:55 EST Subject: [gutvol-d] !@! Creating eText of The Inaugural Address Message-ID: brent said: > if anything I think I would look at this as > a community event more then a milestone i agree. a nice community event. might work out the kinks so that the error-correction team picks up where this leaves off, such that the cleaning of a specific e-text will be focus of the next community event. maybe clean one text every day for the whole year of 2009, it'd be nice. -bowerbird ************** New year...new news. Be the first to know what is making headlines. (http://www.aol.com/?ncid=emlcntaolcom00000026) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Mon Jan 5 23:12:26 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 6 Jan 2009 02:12:26 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-05 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** hi. if you're just coming back to e-mail after the holidays, i am doing another "how to clean o.c.r." series this month, much like the one i did last july. remember? wasn't it fun? so far, we renamed the files to a more-intelligent system. we then checked the _paragraphing_ of the o.c.r. output... then we paid special attention to paragraphs at pagebreaks. then yesterday we compiled a list of the names in the book. *** here's today's lesson: 2009-01-05. check out the sentence-terminators in the book... 
*** as discussed yesterday, this check is fairly simple. we look for instances of any capitalized word which appears mid-sentence, with the exception of the _names_ that we collected yesterday... when we find a capitalized word that doesn't follow a word that is sentence-terminated (e.g., ends with a period, question-mark, or exclamation-point, or any of those characters followed by a double-quote), then we need to inspect it to see if it is an error. odds are that it will be one of two types of errors. the first is that the sentence-termination is an error, usually a "period" that was actually a comma, or just a speck on the page. the second error-type is where the capitalized letter was incorrect. (most typically, it's a lowercase letter misrecognized as uppercase; the letter "i" is especially prone to this, but "u" and "w" happen too.) *** the o.c.r. on this book was especially prone to this problem, with the sentence-terminating period being lost by the o.c.r. there are no less than 53 cases of this problem with this o.c.r. (and remember, we're just using 117 pages of the book now.) 46 of these 53 errors were corrected, during either p1 or p2. but these are the 7 cases that the d.p. proofers missed: dp> dayless night The horses walked with ears pricked me> dayless night. The horses walked with ears pricked 01> http://z-m-l.com/go/hound/houndp112.html dp> front of it An unbroken level of smooth prairie me> front of it. An unbroken level of smooth prairie 02> http://z-m-l.com/go/hound/houndp115.html dp> her brain in an unchecked torrent It seemed to me> her brain in an unchecked torrent. It seemed to 03> http://z-m-l.com/go/hound/houndp127.html dp> the dying man and administered more of the spirit me> the dying man and administered more of the spirit. 04> http://z-m-l.com/go/hound/houndp130.html dp> the door, for the place was in the nature of a dugout me> the door, for the place was in the nature of a dugout. 
05> http://z-m-l.com/go/hound/houndp137.html

dp> The girl ran off, letting her skirt fall as she went
me> The girl ran off, letting her skirt fall as she went.
06> http://z-m-l.com/go/hound/houndp144.html

dp> "Nothing of the sort"
me> "Nothing of the sort."
07> http://z-m-l.com/go/hound/houndp165.html

***

there was one case of a misrecognized lowercase letter:

dp> The object was a sleigh. And the speed at which It
me> The object was a sleigh. And the speed at which it
88> http://z-m-l.com/go/hound/houndp121.html

***

ok, it's probably time for another reminder about my point here.

i'm _not_ saying, "look, the computer can do better than humans".

first of all, that's not even true. the computer missed 7 cases too, where the capitalized word following a missed period was a name. and the computer will continue to "miss" those cases, no matter how many times you run that routine, whereas human proofers -- if you keep doing more rounds -- might well find their errors.

second of all, a computer/human comparison is beside the point, because there's no reason we can't use _both_ to do our proofing.

my _point_ here is that we should use the computer _first_. use the computer to flag all of the instances that _it_ can locate, so we fix 'em, and then use human brains to find the remaining errors which the computer's brainless routine _could_not_ locate.

why use human energy to find errors that the computer can find? it's an unwise use of human resources, a waste of volunteer time.

moreover, humans inevitably miss some of the errors. but if we lower the total number of errors that the humans have to locate, by judiciously applying our computer routines _before_ humans, we will also lower the number of errors humans inevitably miss...

in this case, for just this one type of error, we could have located 46 of 53 cases of that error by using the computer routine _first_, so then the humans would have just had to find the remaining 7.
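a minimal python sketch of the mid-sentence capital check described above. it is an illustration under stated assumptions, not the routine actually used for this book: the function names are hypothetical, `known_names` stands in for the proper-noun list from the 2009-01-04 lesson, and only period, question-mark and exclamation-point (optionally followed by a quote-mark) count as sentence-terminators, so a capital after a dash will be flagged for inspection too.

```python
TERMINATORS = ('.', '?', '!')
QUOTES = '"\''

def ends_sentence(word):
    """True if the word ends with . ? or !, optionally followed by quote-marks."""
    return word.rstrip(QUOTES).endswith(TERMINATORS)

def flag_midsentence_capitals(text, known_names):
    """Return (previous_word, capitalized_word) pairs that deserve a look."""
    flags = []
    words = text.split()
    for prev, cur in zip(words, words[1:]):
        bare = cur.strip(QUOTES + '.,;:!?()')
        if bare.endswith("'s"):
            bare = bare[:-2]            # treat "Grey's" like "Grey"
        if not bare or not bare[0].isupper():
            continue
        if bare == 'I' or bare.startswith("I'"):
            continue                    # the pronoun is always capitalized
        if bare in known_names:
            continue                    # names collected in the earlier lesson
        if not ends_sentence(prev):
            flags.append((prev, cur))   # lost period, or a bad capital letter
    return flags

# usage, with two of the cases quoted in this post
sample = ('dayless night The horses walked. '
          'And the speed at which It came to George Iredale')
print(flag_midsentence_capitals(sample, {'George', 'Iredale'}))
```

as the post says, a lost period in front of a known name slips through this check, so it narrows the work for human proofers rather than replacing them.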
why make your volunteers work harder than they have to? *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... see you tomorrow! -bowerbird ************** New year...new news. Be the first to know what is making headlines. (http://www.aol.com/?ncid=emlcntaolcom00000026) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Tue Jan 6 15:05:44 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 6 Jan 2009 18:05:44 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-06 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-06. examine any cases of numbers within the text... *** we'll take it easy today, and just look at cases of numbers in the text... we find 5 such cases: > "Ther1?? ain't no one aboard of that sleigh," he called > her brain in an unchecked torrent It seen ??1 to > afternoon's 'chicken shoot.1 He says the prairie-chicken > I'll join you directly. Where are you? In the wash-1 > Dead'ning a mind to lofty thought for which by nature meant.1?? all of them are errors, easily fixed after referring to the scan... *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... see you tomorrow! 
-bowerbird

From Bowerbird at aol.com Tue Jan 6 15:26:34 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 6 Jan 2009 18:26:34 EST
Subject: [gutvol-d] more rothman follies
Message-ID:

catching up on my year-end reading, i see david rothman had a blog entry on the last day of 2008 that is funny...

seems that david finally noticed that his beloved .epub format is hard to create... so he's issued a "challenge" for someone -- anyone! -- to write an app to help ease book-making with his "standard" format...

an authoring-tool! what a fantastic idea! why didn't someone think of this before!

what kind of idiot adopts a format that has no working code as a reference implementation? what kind of idiot urges _us_ to be so foolhardy?

of course, .epub still needs a _viewer-app_ as well. the one david has been flogging -- stanza -- strips out all the formatting that people carefully and patiently put in. just a minor technicality to david, i guess.

-bowerbird

p.s. the entry's comments are comical too.

From Bowerbird at aol.com Wed Jan 7 10:52:40 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Wed, 7 Jan 2009 13:52:40 EST
Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-07
Message-ID:

this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"...
> http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9

***

here's today's lesson:

2009-01-07. check for bad or atypical punctuation patterns...

***

another easy one today.

punctuation patterns that raise flags...
period at the start of a line? flag it. exclamation-point at the start of a line? flag it. question-mark at the start of a line? flag it. dash at the start of a line? flag it. comma at the start of a line? flag it. comma with a space in front of it? flag it. comma without a space after it? flag it. period with a space in front of it? flag it. period without a space after it? flag it. semi-colon with a space in front of it? flag it. semi-colon without a space after it? flag it. colon with a space in front of it? flag it. colon without a space after it? flag it. there are a million different anomalies you need to check for... in the typical scan-set for a book, you might get a dozen flags, maybe even two or three dozen, but sometimes just a handful... here is a representative foursome from this book: line starts with a question-mark... p0> ? I'll tell you; you have brought her nothing p1> I'll tell you; you have brought her nothing line starts with a question-mark... p0> ?aw the ranch through the trees, and he greeted p1> saw the ranch through the trees, and he greeted comma not followed by whitespace... p0> contract,she spoke about for her." p1> contract, she spoke about for her." p2> contract she spoke about for her." exclamation-point not followed by whitespace... p0> "And my mare nearly threw me!n her fright." p1> "And my mare nearly threw me in her fright." *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... see you tomorrow! -bowerbird ************** New year...new news. Be the first to know what is making headlines. 
From Bowerbird at aol.com Thu Jan 8 12:43:05 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 8 Jan 2009 15:43:05 EST
Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-08
Message-ID:

this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"...
> http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9

***

here's today's lesson:

2009-01-08. fix the spacey-quotes, both double and single...

***

this is gonna be one of those very unpleasant posts for rfrank, as his handling of spacey-quotes remains absolutely dreadful.

first of all, let's review the matter from a practical standpoint...

it's easy to fix spacey-quotes. you can confirm it for yourself.

assume we're fixing double-quotes. it's drop-dead simple.

1. split the book into paragraphs.
2. step through each paragraph,
3. counting the number of double-quote-marks, and
4. if a spacey-quote is an odd-number, make it _open_;
5. if a spacey-quote is an even-number, make it _closed_.
6. if a paragraph has an odd number of quotes total, then
7. flag it if the next one doesn't begin with a quote-mark.
8. also throw a flag if a quote-mark is inappropriate, e.g.,
9. if an open-quote isn't odd, or a close-quote isn't even.

for #9, an "open-quote" is one that's preceded by whitespace, while a "closed-quote" is one that's followed by whitespace...

that's it. that's the routine, in toto. that's all you need to do.

it fixes 99% of the spacey quotes correctly, every time, blind. even better, it'll draw your attention to quote-mark glitches.

like i said, you can confirm it yourself, with any o.c.r. output. search for a spacey-quote, and then examine its paragraph. you will see that it works, and quite well, thankyouverymuch.
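the nine steps above can be sketched in python roughly as follows. this is a minimal sketch under stated assumptions, not the code actually used for this book: a "spacey-quote" is taken to mean a double-quote-mark with whitespace (or the paragraph edge) on both sides, paragraphs are assumed to be separated by blank lines, and the function names and flag wording are made up for illustration. note that the whitespace it removes can include a line break inside a paragraph.

```python
import re

def fix_spacey_quotes(paragraph):
    """Fix spacey double-quotes in one paragraph.

    Returns (fixed_text, quote_count, flags).
    """
    out, flags = [], []
    count = 0
    skip_space = False
    last = len(paragraph) - 1
    for i, ch in enumerate(paragraph):
        if skip_space and ch.isspace():
            continue                      # drop the gap after an opening quote
        skip_space = False
        if ch != '"':
            out.append(ch)
            continue
        count += 1                        # step 3: count the quote-marks
        odd = count % 2 == 1
        before = i == 0 or paragraph[i - 1].isspace()
        after = i == last or paragraph[i + 1].isspace()
        if before and after:              # spacey-quote: steps 4 and 5
            if odd:
                skip_space = True         # open it: hug the next word
            else:
                while out and out[-1].isspace():
                    out.pop()             # close it: hug the previous word
        elif before and not odd:          # step 9: looks open, counts as even
            flags.append('quote %d looks open but is even' % count)
        elif after and odd:               # step 9: looks closed, counts as odd
            flags.append('quote %d looks closed but is odd' % count)
        out.append(ch)
    return ''.join(out), count, flags

def fix_book(text):
    """Apply the routine to every paragraph; returns (fixed_text, report)."""
    paragraphs = re.split(r'\n\s*\n', text)            # step 1
    fixed, report = [], []
    for n, paragraph in enumerate(paragraphs):         # step 2
        new_text, count, flags = fix_spacey_quotes(paragraph)
        nxt = paragraphs[n + 1].lstrip() if n + 1 < len(paragraphs) else ''
        if count % 2 == 1 and not nxt.startswith('"'):
            flags.append('odd number of quote-marks')  # steps 6 and 7
        fixed.append(new_text)
        report.extend((n, flag) for flag in flags)
    return '\n\n'.join(fixed), report

sample = ('He said, "Nothing of the sort. " Then he left.\n\n'
          '"I wonder," said Alice; " perhaps so.')
fixed, report = fix_book(sample)
print(fixed)
print(report)
```

the report holds (paragraph-number, message) pairs, so the few paragraphs that need a human eye can be pulled up against the scan.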
***

instead of that simple routine, which i have detailed before, you'll find it difficult to believe what rfrank is doing.

recall the first tenet of the hippocratic oath: do no harm...

what rfrank does with spacey-quotes makes 'em even worse, because he inserts a "bullet" character by each spacey-quote.

this means that -- in _addition_ to deleting one space to fix the spacey-quote -- the proofer has to delete the bullet too!

now of course rfrank is gonna say the bullet draws attention, so that's why it's there. and you might be tempted to say that there's a better way to flag it. maybe highlight it in wordcheck, refuse to save any proofed text containing a spacey-quote, etc.

but all of that is _beside_the_point_, the point being that _all_ of the spacey-quotes can be fixed -- easily and automatically -- before _any_ of the text ever goes in front of the first proofer...

they've got a precious resource over at d.p. -- people who have _volunteered_ to proof every single word of text against a scan -- and they are squandering that precious resource by having it do brain-dead work that computers can do _faster_ and _better_.

it's a metaphor for how our society treats oil. once we've used up almost all of it, we'll come to realize exactly how valuable it really was, and we'll regret that we just burned it!

***

you can use the routine to fix any spacey-single-quotes too. you just need to control for _contractions_ -- easy enough, as those single-quotes have white-space on neither side -- and s-apostrophe _possessives_ plus the occasional _slang_. (the interdependencies with slang are actually quite useful.)

***

for any e-book programmers out there, this routine is also how you transform "straight" quotes into the "curly" variety... this lets you offer curly-quotes to readers who prefer them, while making it simple for the content providers by allowing them to use straight-quotes.
(plus it's also the case that your _search_ routines are more robust based on straight-quotes.) *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... see you tomorrow! -bowerbird ************** A Good Credit Score is 700 or Above. See yours in just 2 easy steps! (http://pr.atwola.com/promoclk/100000075x1215047751x1200957972/aol?redir=http://www.freecreditreport.com/pm/default.aspx?sc=668072%26hmpgID=62%26bcd=De cemailfooterNO62) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Bowerbird at aol.com Fri Jan 9 00:30:16 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 9 Jan 2009 03:30:16 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-09 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-09. fix dashes, em-dashes, and hyphenation issues... *** there are a number of issues that relate to the dash character, which is used for dashes, em-dashes, hyphenates, and so on... *** the first thing i fix are em-dashes. it doesn't matter -- much -- if you use the d.p. convention ("clothing" an end-line em-dash, and strictly removing any spaces on either side of the em-dash) or my method ("floating" an em-dash with a space on each side). and it doesn't matter if you allow consecutive em-dashes, or not. 
but what you _do_ want to fix are 3-dash em-dashes: > come back, and she welcomes you with open arms---we > slowly---almost helplessly. The man had leaned this o.c.r. also had some lines with 5-dash em-dashes: > "It can't-----" The minister got no further, and > "It's-----" some one said and broke off. Then > G-----" The last word died away to a gurgle. A > snarl at, you d-----d cur," Hervey said, between his > much a reality. I cannot marry you--until--until-----" it even had some lines with 6-dash em-dashes: > "You don't say------" Mrs. Mailing gasped; it was > hand, "another mile of this d------d valley and I > overwhelming. If I can--er--be------" > security on my certain inheritance of the farm------" > "Curse the man for his d------d superiority," he > before------" She broke off, only to resume again with > may love me now; I believe I love him, but------No, > "Are------Oh, come away, I can't stand it" > were going to get it------Look out!" > "I wonder," said Alice; ?" perhaps he has dis-covered------" > very, very much, but------" > discover the wretch who did him to death------" > and then------No, mother and I will see the matter *** next up for checking are the cases of a single-dash... of course, a single-dash shouldn't occur at the start of a line: > -us. He's always over here. And he never by any > -forgotten him. Put yourself in my place--put > -is even now inserted at certain times. The man or > -I despised myself. My conscience cried out. *** i find it useful to pull out all mid-line hyphenates into a list... i review them to spot the occasional misrecognition and such. i've appended the list of mid-line hyphenates to this post. if you feel like it, review it yourself to see what you can spot. 
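a minimal python sketch of the three dash checks covered so far: runs of three or more dashes, lines that start with a single dash, and a list of the mid-line hyphenates. it is a sketch under stated assumptions: the function name is hypothetical, the checks only _report_ (the post does not spell out what each long run should become, so nothing is rewritten), and a "hyphenate" is assumed to be any whitespace-separated token that contains a dash but no double-dash.

```python
import re

def dash_report(text):
    """Return (long_runs, leading_dash_lines, hyphenates) for an o.c.r. text.

    long_runs: (line_number, run_length) for each run of 3 or more dashes
    leading_dash_lines: numbers of lines that start with a single dash
    hyphenates: sorted, lowercased tokens containing a lone dash
    """
    long_runs, leading, hyphenates = [], [], set()
    for lineno, line in enumerate(text.splitlines(), 1):
        for match in re.finditer(r'-{3,}', line):
            long_runs.append((lineno, len(match.group())))
        if re.match(r'-(?!-)', line):
            leading.append(lineno)
        for token in line.split():
            if '-' not in token or '--' in token:
                continue                  # no dash at all, or an em-dash
            # trim surrounding punctuation, but keep apostrophes (a-courtin')
            hyphenates.add(token.strip('.,;:!?()"').lower())
    return long_runs, leading, sorted(hyphenates)

sample = '\n'.join([
    'come back, and she welcomes you with open arms---we',
    "-us. He's always over here. And he never by any",
    '"It can\'t-----" The minister, a law-abiding man.',
    '"O-oh! I thought it was some one being murdered."',
])
runs, leading, words = dash_report(sample)
print(runs)
print(leading)
print(words)
```

the sorted list is meant to be skimmed by eye for misrecognitions, and oddities such as a token that starts or ends with a dash stand out quickly in it.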
*** these are the lines for the cases that i thought need attention: > contempt in which he held-all law-breakers, rather > contempt in which he held all law-breakers, rather > http://z-m-l.com/go/hound/houndp113.html > feil-to with characteristic avidity, complaining vociferously > fell to with characteristic avidity, complaining vociferously > http://z-m-l.com/go/hound/houndp136.html > she'd have a bully time. No cheese-or butter-making, > she'd have a bully time. No cheese- or butter-making, > http://z-m-l.com/go/hound/houndp143.html > I'll join you directly. Where are you? In the wash-1 > I'll join you directly. Where are you? In the wash- > http://z-m-l.com/go/hound/houndp144.html > a man about the house who idles. Mussy-a-me, he (correct) > http://z-m-l.com/go/hound/houndp144.html > "O-oh! I thought it was some one being murdered." (correct) > http://z-m-l.com/go/hound/houndp166.html > used to come frequent-like before--before--" with a (correct) > http://z-m-l.com/go/hound/houndp182.html > Almost without forewarning the road,-after rounding > Almost without forewarning the road, after round- > http://z-m-l.com/go/hound/houndp191.html > the isolation, the secret nightly doings, the unsuit-ability > the isolation, the secret nightly doings, the unsuit- > http://z-m-l.com/go/hound/houndp197.html > word muttered below his breath voiced his discovery- > word muttered below his breath voiced his discovery. > http://z-m-l.com/go/hound/houndp200.html > and every nerve tense-drawn. What was this--thing? > and every nerve tense-drawn. What was this-- > http://z-m-l.com/go/hound/houndp201.html > and how he had never failed to urge the undesir-ability > and how he had never failed to urge the undesir- > http://z-m-l.com/go/hound/houndp203.html > "I wonder," said Alice; ?" perhaps he has dis-covered------" > "I wonder," said Alice; ?" perhaps he has dis- > http://z-m-l.com/go/hound/houndp212.html > said in his low, pleasant voice. 
"What is this"- > said in his low, pleasant voice. "What is this" -- > http://z-m-l.com/go/hound/houndp214.html > in its true colours, and I saw myself as I really was-a > in its true colours, and I saw myself as I really was -- > http://z-m-l.com/go/hound/houndp220.html with 12 out of the 15 cases incorrect, that was a very good check. if you get a false-alarm rate of 20% on a check, that's acceptable. some of these glitches were specks misrecognized as a dash, but there are also cases where a dash should have been an em-dash, such as the instances on page 214 and page 220. other cases, specifically on pages 197, 201, 203, and 212, were due to some kind of problem in the d.p. dehyphenation routine, and not a misrecognition by the o.c.r. but i left them in anyway. *** i think that describes all the problems with the dash character... oh yeah, one other thing. it's often the case that o.c.r. _misses_ end-line hyphens. however, it's fairly easy to re-introduce 'em. the routine is simply to join the last word of each line with the first word of the next line, and see whether this new "word" is present in the dictionary. if it is, you probably need a hyphen. there are some false-alarms with this routine. ones i remember are "a/long" and "for/a" and words starting with every/any/no... it's not difficult to think of other ones, either, such as "be/long". at some point, i'll build up a dictionary of these "exceptions" so this re-hyphenation routine is more robust, but it's fine for now. i'll also compile a dictionary of mid-line hyphenates some time, which will o.k. all the seemingly-valid ones, so it'll be easier to focus on ones needing attention. but that's fine for now too... *** i've also recently started changing the "soft hyphens" that occur on end-line hyphenates to the "~" character, to facilitate rewrap. d.p., of course, simply eliminates the hyphen, rejoining the word. *** here's the summary of our lessons: 2009-01-01. 
rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... see you tomorrow! -bowerbird p.s. here is the list of mid-line hyphenates within this o.c.r. a-courtin' a-follerin' account-books after-effect all-important arch-plotter bacon-yielding bad-men bake-house book-case boulder-strewn business-like butter-making carefully-finished carefully-laid cattle-raising cheese-or cheque-book choke-cherries clean-cut cleaning-rod common-sense curiously-assorted currant-bushes cut-bank dark-green day-long dead-and-gone dead-house death-chamber dis-covered discovery- dressing-gown dressing-table dun-coloured ear-piercing elbow-joint eye-witnesses far-distant farm-wife feil-to ferret-faced fishing-boat fishing-tackle fool-headed fore-paws fore-sight frequent-like fur-clad good-bye good-humour good-night gun-rack gun-racks half-a-dozen half-breed half-hour half-veiled heavily-shod heel-posts held-all hell-belt-fer-leckshuns hiding-place high-pitched horse-thieves ill-concealed ill-manners in-catching iron-grey jack-rabbits law-abiding law-breakers law-breaking light-blue light-heartedly log-hut long-drawn long-enduring long-sighted milk-pans money-making money-raising moss-backs mud-filling mussy-a-me neck-yoke nerve-racking night-cap non-appearance note-paper mussy-a-me o-oh off-saddle off-side official-looking on-coming passer-by pent-up pine-cones plate-layers potato-parings prairie-bred prairie-chicken prairie-lands pre-occupied rain-clouds re-adjusted re-echoed re-saddled rice-lake riding-whip 
road,-after rush-bottomed saddle-bags saddle-blankets saddle-up safe-guarded school-friend school-house school-ma'am screech-owl screech-owls self-accusations self-disgust self-imposed self-loathing self-possessed self-realization semi-circle sewing-machine shirt-sleeves sitting-room six-chambered sixty-two skirt-folds skull-caps slap-jacks sleigh-bells smash-up smock-shaped snow-drift so-called south-eastern south-west spartan-like stock-raising stone-covered stone-deaf stone-marked storm-clouds storm-porch strange-looking sun-bonnet sun-burnt sun-tanned sweet-smelling swift-treading table-cover tell-tale tense-drawn this"- three-legged tie-post tightly-closed to-day to-morrow town-bred tree-trunks two-year-old tying-posts under-garments under-shirt undesir-ability unsuit-ability various-shaped was-a wash-1 wash-house wax-like weather-boarding weirdly-outlined well-loved well-rounded well-scrubbed whisky-and-water wildly-staring wood-covered wool-worked work-table wrong-mindedness and these are the specific cases that i thought needed more attention: cheese-or dis-covered discovery- feil-to frequent-like held-all mussy-a-me o-oh road,-after tense-drawn this"- undesir-ability unsuit-ability was-a wash-1 From Bowerbird at aol.com Sat Jan 10 20:41:04 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sat, 10 Jan 2009 23:41:04 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-10 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-10.
check for, and fix, any weirdnesses that appear... *** ok, we're closing in on the finish here... we're checking for weird characters now. depending on the book, some of these might be ok. but in general, they are o.c.r. errors... here's a reg-ex that should reveal weirdnesses: > [`~@#$%^&*()_+={}\\/] sure enough, in this book, this search pointed to this: > more resumed its sway, and thought flowed th?Y)ugh > her brain in an unchecked torrent It seen ^1 to > slight frame, and scalding tears coursed down he* > women that ?'$1 never let the child rest a minute." > * By Jove I you are a good sort, George. If you > * How much will appease your creditors?" > "I'm not what you might call a ?' free agent* There > in, both of you, or the ?' slap-jacks' ?'$1 all be spoiled." > pool of water. Let it stand, and i* vcrrodes with > Dead'ning a mind to lofty thought for which by nature meant.1* these 4 lines too, but these "errors" were "introduced" by d.p. > by his decisions in matters which required more con-* > *sideration than she could give--matters which were > spectres marching in procession through the mys-* > *terious graveyard, but real, live, human beings. What, myself, since i prefer to delete unnecessary high-bit characters, i also do a search for them. in this particular o.c.r., that found: > Leslie Grey looked ??t his watch; the hands indicated > more resumed its sway, and thought flowed th??Y)ugh > the set ??f her features. A hard, relentless look had > Prudence, nor was he slow to appreciate the possl?? > ?? When shall you return?" all of those were actually errors. no diacritics in this text... oh yeah, good thing i did that check for high-bit characters, as that's how i found all the "bullets" rfrank put by spacey-quotes (and other questionable places). i just mass-deleted all of 'em. *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. 
fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... see you tomorrow! -bowerbird From Bowerbird at aol.com Sun Jan 11 15:40:20 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sun, 11 Jan 2009 18:40:20 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-11 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-11. do spellcheck, with the book-specific dictionary... *** ok, now it's time to do a spellcheck. yeah, i know, if you're from d.p., you're saying "hey wait, you want us to do a spellcheck _before_ it goes to the proofers? that's the job of the proofers, to do the spellcheck! why even bother proofing a book if you're gonna do a spellcheck first?" that's what they say at d.p., but i think it's an idiotic position. once you have a book-specific dictionary -- and remember that we compiled a list of the names in the book, with which to supplement the regular dictionary -- spellcheck returns _very_few_ occurrences, and most of those are _real_ errors. why would you waste energy to locate those errors manually?
take advantage of the fact that computers can pinpoint them! since you can do this spellcheck in a matter of mere minutes, it's a tremendous investment that creates a very clean e-text. _then_ when your humans examine every word on every page, it's merely to _confirm_ that the computer got things right -- or, in those rare cases, to determine the computer messed up -- which means that you have increased certainty that a page doesn't need to have further review, that it actually _is_ clean. if there are errors on a page, the simple fact -- proven over and over and over again, if you just _look_ at the d.p. data -- is that the human will catch _most_ of them, but miss _some_. by investing a little time to get each page as clean as possible to begin with, you increase the chances that the human being will find _the_last_error_ on a page (if, indeed, there still is one). if d.p. had a decent spellchecker (yes, i know about wordcheck), they would know this already. there's no sense in using humans to catch glitches that a computer can find _faster_ and _better_. save human resources for the bugs the computer _can't_ find... stuff like stealth scannos, p-book errors, bad punctuation, etc. *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... 2009-01-11. do spellcheck, with the book-specific dictionary... see you tomorrow!
-bowerbird From Bowerbird at aol.com Wed Jan 14 15:06:57 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 14 Jan 2009 18:06:57 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-12 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using part 2 of the book "the hound from the north"... > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 *** here's today's lesson: 2009-01-12. seek out external validation, when it's possible... *** at this point in time, our clean-up of the o.c.r. text should be fairly complete... we're lucky, in this case, because this o.c.r. was subjected to human proofing at distributed proofreaders, so we can actually get some feedback on our overall success... we'll compare the d.p. output with our computer-cleaned text. as i've proven here time after time, this comparison method is a _most_excellent_ way to find errors in _both_ digitizations... and, of course, no reason we can't _combine_ these methods. the synthesized e-text might still have errors -- mistakes that _neither_ my computer routines nor the d.p. human proofers managed to spot -- but sadly, that's always gonna be the case. that's why we need to actively recruit the public to report errors. since thus far we've only been working on one _part_ of the book, the results at this specific time aren't really all that important yet. we'll wait to do this comparison until we've done the whole book. so, what we'll do from now on is just re-trace our steps, but with _all_three_parts_ of the book -- as opposed to simply part 2 -- to see how well part 2 represented the book as a whole.
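a rough sketch of what such a comparison of two digitizations could look like, in python, using the standard difflib module. this is only an illustration under stated assumptions, not the routine actually used in this series: the function name word_differences and the sample strings are made up, and a real run would read the two full e-texts from files.

```python
import difflib


def word_differences(text_a, text_b):
    """Return (words_from_a, words_from_b) pairs wherever the two texts disagree.

    Both texts are split on whitespace first, so differences that are
    only a matter of line-wrapping do not show up as disagreements.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False keeps difflib from ignoring very frequent words in long texts
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return differences


if __name__ == "__main__":
    # hypothetical sample: one raw o.c.r. line versus a proofed version of it
    ocr_text = "more resumed its sway, and thought flowed th?Y)ugh\nher brain"
    proofed_text = "more resumed its sway, and thought flowed through her brain"
    for left, right in word_differences(ocr_text, proofed_text):
        print(repr(left), "<>", repr(right))
```

each pair that comes back is a place where a person should look at the scan, since either side (or both) could be the one in error.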
*** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... 2009-01-11. do spellcheck, with the book-specific dictionary... 2009-01-12. seek out external validation, when it's possible... -bowerbird From Bowerbird at aol.com Wed Jan 14 15:09:53 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 14 Jan 2009 18:09:53 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-14 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using the book "the hound from the north", available at d.p. > http://www.pgdp.net/c/project.php?id=projectID494113af10e06 > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 > http://www.pgdp.net/c/project.php?id=projectID494113f2b0f5f *** and now we extend the lessons from part 2 to the entire book... 2009-01-14. rename files and change separators, if necessary. we do this for the whole book. > http://z-m-l.com/go/hound/houndp001.html *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02.
fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... 2009-01-11. do spellcheck, with the book-specific dictionary... 2009-01-12. seek out external validation, when it's possible... and now we extend the lessons from part 2 to the whole book: 2009-01-14. rename files and change separators, if necessary. see you tomorrow! -bowerbird From Bowerbird at aol.com Thu Jan 15 20:45:35 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 15 Jan 2009 23:45:35 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-15 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using the book "the hound from the north", available at d.p. > http://www.pgdp.net/c/project.php?id=projectID494113af10e06 > http://www.pgdp.net/c/project.php?id=projectID494113cf71fd9 > http://www.pgdp.net/c/project.php?id=projectID494113f2b0f5f *** 2009-01-15. fix the paragraphing, including the terminators... detailed report on this tomorrow, when we finish paragraphing. *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03.
fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... 2009-01-11. do spellcheck, with the book-specific dictionary... 2009-01-12. seek out external validation, when it's possible... and now we extend the lessons from part 2 to the whole book: 2009-01-14. rename files and change separators, if necessary. 2009-01-15. fix the paragraphing, including the terminators... see you tomorrow! -bowerbird From Bowerbird at aol.com Wed Jan 21 14:34:38 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 21 Jan 2009 17:34:38 EST Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-16 Message-ID: this is the january 2009 series on "how to clean o.c.r.". we're using the book "the hound from the north", available at d.p. the book was previously split into 3 parts, but is now consolidated: > http://www.pgdp.net/c/project.php?id=projectID494113af10e06 *** 2009-01-16. fix the paragraphing around the pagebreaks too. did you miss this? :+) here's a good long post, chock-full of some juicy examples... ;+) for those who want to follow along with the actual d.p. o.c.r. output, you can download it from that u.r.l.
which i just gave you up above -- "download concatenated text", and "ocr" and "latest text saved" -- or you can save yourself a little bit of the hassle and get it right here: > http://z-m-l.com/go/hound/hound-dp-ocr.txt *** ok, so far, we've fixed the filenames and changed the separators. since the offset was -6 throughout, this task was extremely easy. it's worth noting that -- once the files have been wisely named -- it's a very simple step to put them on the internet, so i usually do. that means that i can immediately refer other people to the scans. most of the time, that means you people here, but it could also be people that i'm proofing with collaboratively. moreover, my tools are all built such that they can automatically download the scans from the web to the current machine, so i upload them right away so as to employ that functionality immediately for other people... having the scans on the web, with the text alongside, also means anyone can step through the pages to see what i'm talking about. the usual model at p.g. and d.p. is that putting the e-text "online" means that it's "finished". for me, though, it is just the beginning. *** back to "hound from the north"... we will now fix all the paragraphing and paragraph-terminators, specifically including those located right around the pagebreaks. part of the process of fixing paragraphs involves the elimination of superfluous lines from the o.c.r. there are several methods for detecting these superfluous lines, which we can discuss here now. here are some of the heuristics i have used at one time or another: 1. short lines (<20 characters) which are not sentence-terminated. 2. lines that have no lowercase letters. 3. lines that start with a number. 4. lines that end with a number. 5. lines that contain a number. here are some of the lines that are returned with such heuristics: > CHAM > * > D > M > 80 THE HOUND FROM THE NORTH > 81 G > M > IT > OPIUM. 
> ,' > R > A STAB IN THE DARK 253 > A STAB IN THE DARK 25$ > 266 THE HOUND FROM THE NORTH > 373 T > 331 > 333 > 337 > 341 except for the "opium." line, all of these lines should be eliminated. (all run-head lines are eliminated from d.p. output, per their policy. i don't think this is a smart policy, but that's what they do, so be it.) *** here are some more lines that were returned from those heuristics, distinct from the above in that they each contain lowercase letters: > sight > breakfast > moment > it > accent > imminent > breakfast > man.'" > ,' > 'Bully.'" > thought > for it* > night > night most of these are lines that simply lost their paragraph termination, so they need to be repaired rather than deleted. that's easy enough. *** you should also flag short lines with mixed capitalization, like these: > tAG* > fHK BUD the top one was the "page" header you find on the table of contents. the bottom one was "the end". (sometimes o.c.r. is unbelievably bad.) *** here are some longer lines with mixed capitalization: > even take the trouble to quArrel." > "You know the ranch and its surroundings welL > "Maybe she learned you, my girL" > impassable. She looked back along the traiL The easy enough to fix these by simply referring to the page-scan... *** here are the lines that were flagged because they contain the number "1": > II. MR. ZACHARY SMITH . . . 15 > VI. THE PROGRESSIVE EXJCHRE PARTY . 81 > vm. GREY'S LAST WORDS . ,115 > IX. LONELY RANCH AT OWL HOOT , ? 133 > X THE GR>VEYARD AT OWL HOOT . . 157 > XIX. THE AVENGER.. .. 311 > "Yes, sir," this individual was saying, "she's goin1 > "They reckon that the ' rush' to the Yukon '$1 come > good. My weed '$1 do me. You don't fancy to try > 'YELLOW BOOMING--SLUMP IN GREY1 > 65 r > 81 G > no pleasing some folks. I s'pose Mr. Chillingwood '$1 > "Ther1 ain't no one aboard of that sleigh," he called > her brain in an unchecked torrent It seen ^1 to > women that '$1 never let the child rest a minute." 
> afternoon's 'chicken shoot.1 He says the prairie-chicken > I'll join you directly. Where are you? In the wash-1 > in, both of you, or the ' slap-jacks' '$1 all be spoiled." > Dead'ning a mind to lofty thought for which by nature meant.1* > "River '$1 stop it" > dog '$1 help me to discharge my debt. Good-bye, Al; > 331 > 341 all of them need some kind of work, as far as i can tell... *** since we did the number "1", might as well finish off the other numbers: > enjoying a bare existence on an income of $5 Per > 0 Slow work," said the stranger, indifferently. *** ok, now what you are left with, from those heuristics above, once you've deleted and edited all of the _short_ lines, is this: > "AND EVERY NOW AND THEN IT WOULD CEASE ITS HEALING > OPERATION TO THROW UP ITS LONG MUZZLE ANI> > EMIT ONE OF THOSE DRAWN--OUT HOWLS." > AUTHOR OF > A. L. BURT COMPANY > (INCORPORATED) > CONTENTS > I. IN THE MOUNTAINS > II. MR. ZACHARY SMITH . . . 15 > III. MR. ZACHARY SMITH SMOKES . ? 29 > V. THE RETURN OF THE PRODIGAL . . 65 > VI. THE PROGRESSIVE EXJCHRE PARTY . 81 > VII. LESLIE GREY FULFILS HIS DESTINY . . 98 > IX. LONELY RANCH AT OWL HOOT , ? 133 > X THE GR>VEYARD AT OWL HOOT . . 157 > XII. THE BREAKING OF THE STORM . . 2O2 > XIII. BLACKMAIL . . . . .226 > XIV. A STAB IN THE DARK . . . 240 > XV. THE MAGGOT AT THE CORK . . . 257 > XVI. AN ECHO FROM THE ALASKAN MOUNTAINS 373 > XVII THE LAST OF LONELY RANCH .. 286 > XVUI. THE FOREST DEMON PURSUES. . 506 > XIX. THE AVENGER.. .. 311 > IN CONCLUSION ... . 34! > HOUND FROM THE NORTH > CHAPTER I > IN THE MOUNTAINS > CHAPTER II > MR. ZACHARY SMITH > CHAPTER III > MR. 
ZACHARY SMITH SMOKES > CHAPTER IV > 'YELLOW BOOMING--SLUMP IN GREY1 > CHAPTER V > THE RETURN OF THE PRODIGAL > CHAPTER VI > THE PROGRESSIVE EUCHRE PARTY > CHAPTER VII > LESLIE GREY FULFILS HIS DESTINY > CHAPTER VIII > GREY'S LAST WORDS > CHAPTER IX > LONELY RANCH AT OWL HOOT > CHAPTER X > THE GRAVEYARD AT OWL HOOT > CHAPTER XI > CANINE VAGARIES > CHAPTER XII > THE BREAKING OF THE STORM > "ROBE CHILLINGWOOD." > CHAPTER XIII > BLACKMAIL > CHAPTER XIV > A STAB IN THE DARK > CHAPTER XV > THE MAGGOT AT THE CORE > CHAPTER XVI > AN ECHO FROM THE ALASKAN MOUNTAINS > CHAPTER XVII > THE LAST OF LONELY RANCH > CHAPTER XVIII > THE FOREST DEMON PURSUES > "PRUDENCE." > CHAPTER XIX > THE AVENGER > IN CONCLUSION with only a few exceptions, what we've got here are chapter-headers, both as they're listed in the table of contents, and throughout the book. this is very valuable, as it helps us grok the actual structure of the book, and it basically just _fell_out_ of our heuristics used to delete bad lines... remember that i said that i like to put the text and scans up on the web right away, at the beginning of the process. this ability to pull out the chapter-structure of the document helps make that web-product better, because it allows us to put in chapter-links, so people can "skim across" the top of the chapters, jumping back or forward from chapter to chapter. fluid navigation is one of the most vital elements of an electronic-book... i usually change the chapter-headers to titlecase, rather than uppercase. by the way, a perusal of the lines above informs us that some chapters are "missing" from the table of contents. they had lowercase letters so they weren't pulled here, and that lets us know we'll need to edit them. specifically, entries for chapters 4, 8, and 11 were not included above: > iv. 'YELLOW BOOMING--SLUMP IN GREY' . 46 > vm. GREY'S LAST WORDS . ,115 > XI. CANIJV'B VAGARIES . . . . l8l and sure enough, they need editing... 
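as an illustration, the "no lowercase letters" heuristic that surfaced those chapter-headers could be sketched in python roughly like this. it's a minimal sketch, not the actual tool: the function names are invented, and one assumption is added that is not in the heuristic as stated, namely that a line must contain at least one letter, so bare page-numbers such as "331" are left to the number checks.

```python
def has_no_lowercase(line):
    """True if the line contains at least one letter and no lowercase letters."""
    letters = [ch for ch in line if ch.isalpha()]
    return bool(letters) and not any(ch.islower() for ch in letters)


def uppercase_lines(lines):
    """Return the lines (in order) that pass the no-lowercase test."""
    return [line for line in lines if has_no_lowercase(line.strip())]


if __name__ == "__main__":
    # sample lines taken from the o.c.r. excerpts quoted in this post
    sample = [
        "CHAPTER XI",
        "CANINE VAGARIES",
        "vm. GREY'S LAST WORDS . ,115",
        "XI. CANIJV'B VAGARIES . . . . l8l",
        "331",
    ]
    for line in uppercase_lines(sample):
        print(line)
```

run on that sample, the two body headings pass, while the two damaged table-of-contents entries are skipped because a misrecognition put lowercase letters in them, which is the same reason they went missing from the list above.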
note that the "lowercase letters" in the entry for chapter 11 are actually two occurrences of lowercase "l" as a misrecognition in the pagenumber of "181". that reminds us that we need to do a search for the cases where numbers and letters abut. that check turns up these lines: > XI. CANIJV'B VAGARIES . . . . l8l > XII. THE BREAKING OF THE STORM . . 2O2 so we see one of our other table-of-contents entries needs to be edited. *** as those last few examples illustrate, it's smart to use the 2 occurrences of each chapter heading -- in the table of contents, and the body itself -- as a _cross-check_ on each other. if they don't match, they need editing. (and yes, sometimes these two will _differ_ in a p-book, in a way that suggests that it might _not_ have been an error. nonetheless, this is one of the things that i -- as a republisher -- will edit when i digitize. i enforce a rule that the chapter-header in the table-of-contents must match that same chapter-header as listed in the body of the book itself. it's not just for consistency either. my automatic table-of-contents links are dependent on having these 2 occurrences match each other exactly. yeah, i could re-program that requirement. but i think it's a good one.) by the way, this _cross-check_ found 3 lines here that needed editing: > THE GR>VEYARD AT OWL HOOT > THE GRAVEYARD AT OWL HOOT > THE MAGGOT AT THE CORE > THE MAGGOT AT THE CORK > THE PROGRESSIVE EUCHRE PARTY > THE PROGRESSIVE EXJCHRE PARTY *** while we're on the table-of-contents, i'll note that you generally have to resign yourself to editing those manually, since they usually o.c.r. badly. in fact, frontmatter as a whole tends to need manual editing of its o.c.r. yeah, i've written code. 
and as i've shown above, there are cross-checks that you could use to engender a good amount of automatic editing, but you're still going to have to examine the table-of-contents very carefully, so it's usually just as fast to bite the bullet and do the editing manually. *** we also want to flag any sentence-terminated-line which is followed by a blank line that is then followed by a line starting with a lower-case letter: > u Sakes alive, girl, yes. It's the way you have said. > murmured Sarah, with a pensive smile, while she > fHK BUD simple editing fixes these cases just fine... *** we also want to flag lines that start with a bad 1-character "word": > ? You don't say I Alone?" > ? Yes." > * The only thing I've had--that and my fur coat--to > ? Fair." > 0 Slow work," said the stranger, indifferently. > * Daylight," he murmured; "and they've let ?:he > * Storm," he observed shortly. > ? Bunkum!" > * Missed the trail," the other said, pitching a cord-* > ? Can I?" > ' Secrets withheld 'twixt man and wife, > ? Prudence, isn't it?" > ' broke.' He's too knocked up with travelling--he's > ' Secret'--as you are pleased to call it--of Lonely > * I don't know," replied Grey. > ' I have just remembered something. I came across > " "Ah, not vinegar." > ' backed ' there. Come on, man; you can get another > * By Jove I you are a good sort, George. If you > * How much will appease your creditors?" > u Sakes alive, girl, yes. It's the way you have said. > ? When shall you return?" > ' through mail' to the coast and have to make up > ' A life monotonous, unrelieved, breeds selfish discontent, > ' I'll close the window." Iredale moved across the > ? Is?" > * We'll run this thing for all it's worth. Hang to > ? Must?"
> ' Yellow booming--slump in Grey,' was the man who > ' yellow-devils' are distributed by means of loaded > ' Nature designs all human ills, but in the making > * The price he parted with his cattle to me for was > * Um I I wonder where he got him from," in a > * Well, to work. Set the fires going." > ? Good." > ? Don't know." > * Not one cent, you cowardly hound!" he roared. as you can see, most of these were cases where the o.c.r. misrecognized the double-quote-mark, indicating dialog, at the start of a paragraph... in fact, when the line above a line like this is _blank_, you can pretty much make the change blind. the exception to the general rule is the case like: > ' yellow-devils' are distributed by means of loaded where the single-quote-mark at the beginning of the line is _supposed_ to be a single-quote-mark (albeit not one that is followed by a space), and you can detect these cases because they are _not_ preceded by a blank line. but again, it's best not to make global-changes anyway. program yourself a tool that lets you examine the scan in order to decide if you make an edit. *** ok, now that we've eliminated or edited all of the "garbage" lines in the file, we can do some of our more-straightforward checks to fix paragraphing... *** as paragraph-checking depends on correct paragraph-terminations, we want to fix paragraph-terminations before checking paragraphs... as hinted above, paragraph terminations were a big problem in this file. i found _200+_ places where the paragraph-terminations were suspect. the list is too long to put in the body of this post, so i have appended it... most of the time, with the o.c.r. in a typical book, you will usually find that these cases are fairly balanced between missing paragraph terminations and incorrectly inserted blank lines. in this book, however, the majority of these cases were due to missing paragraph terminations. 
and indeed, in the "project comments" forum for this particular book, a proofer said: > Beware, the last letter of some sentences, > especially the "t" tends to obscure > ending punctuation periods/full stops. rather than just adding that item to the mental list the proofers maintain, however, it's a lot smarter to have the machine search out those instances, as a concentrated task, so the person handling them can focus on just that. *** now that we've found and fixed all the paragraph-termination glitches, we can focus on finding and fixing the bugs in the paragraphing per se. the obvious check is to see if a lowercase line is preceded by a blank line. sure enough, we find -- and will fix -- some 61 lines like this in the file: 01> tAG* 02> iv. 'YELLOW BOOMING--SLUMP IN GREY' . 46 03> vm. GREY'S LAST WORDS . ,115 04> "that's my opinion," said Grey definitely. 05> c 06> the primest bacon. Hallo, here comes the d------d 07> neche. What's up now, I wonder? Well, Rainy-Moon, 08> head. 09> has blown itself out. You've missed the trail, I take 10> more--no less; and I set out on foot." He was 11> gun," he muttered, with an uneasy look at the 12> world. 13> to you." 14> on her sex without good reason. Her moral standard 15> forget." 16> waiting girl. 17> her. 18> dogone Indian." 19> me kick him." Then, after a pause, "?But I think he 20> me--and he did marry me. I was all sort of swept 21> off my feet" 22> dryly. 23> deep-sunken, cow-eyes of his------" Robb broke off as 24> he saw Grey start ?" Why, what's up?" 25> have been he. Ah------" He broke off and glanced 26> in the direction of the window as the jangle of sleigh-bells 27> i 28> all she could say. 29> he fingered his watch from force of habit. 30> followed an excited murmur. "What's Peter going 31> too late. They had seen the blood upon his hand. 32> face was blanched, but her mouth was tightly clenched. 33> prostrate man had vanished; a world of pity was in 34> her eyes as she silently looked on. 
35> violent fit of coughing seized the dying man, then it 36> t 37> hand, "another mile of this d------d valley and I 38> should have turned tail and fled back to the open. 39> amount that Hervey promptly decided to double the 40> muttered. "?I suppose he thinks I am blind. Well, 41> before------" She broke off, only to resume again with 42> a fierce and passionate earnestness of which Alice 43> may love me now; I believe I love him, but------No, 44> u Sakes alive, girl, yes. It's the way you have said. 45> snarl at, you d-----d cur," Hervey said, between his 46> clenched teeth. Then he turned at the sound of his 47> about him curiously. 48> o 49> for further demonstration. The prolonged screech of 50> p 51> gazed appealingly into his face. And the man had 52> and then------No, mother and I will see the matter 53> through. We have already secured the services of 54> you say. You shall have the money. The rest shall 55> destroyed her happiness. But her words smote hard. 56> murmured Sarah, with a pensive smile, while she 57> u 58> lover on his way to the farm, and give him timely 59> door was flung open, and Hervey stood in their midst. 60> stopped her and drew her back. 61> fHK BUD again, it's best to examine these cases, rather than change them blindly. it's true there are very few false alarms, so you don't need to worry about _bad_changes_, but it's also true that examination of the cases can tip you about any nearby lines where excess blank lines were improperly inserted. an example of this occurs on page 125 (or 131 in d.p.'s warped numbers). the line that goes like this: > too late. They had seen the blood upon his hand. is indeed preceded improperly by a blank line... but so are several other nearby lines, not detected by this routine since they actually begin with a capital letter. these include the lines: > Other girls looked as though they might follow suit. > Only Hephzibah Mailing stood her ground. Her > She uttered no sound.
All her anger against the so, right there, we caught 3 additional errors by doing visual confirmation. *** another check is for a line that _ends_ with a double-quote-mark and which is _followed_ by a line that _begins_ with a double-quote-mark: > "What--about the farm?" > "Well, I wasn't just thinking of the farm." > "No, thanks. I like candy." > " "Ah, not vinegar." > prospects. Is that all plain?" > "Yes, yes; go on." > ? Don't know." > "Must be." these cases all needed to have a blank line inserted between their lines... *** another check is for _short_lines_ which are not followed by a blank line. you'll see that some of these cases are the same ones just returned above: > "What--about the farm?" > "Well, I wasn't just thinking of the farm." > She assumed an air of surprise. > "Why not, my child?" > prospects. Is that all plain?" > "Yes, yes; go on." > if possible." Then to Mrs. Mailing, > "May I?" > all. "But on, girl, on. There is more > to do yet" > "Don't know." > "Must be." > George Iredale a murderer!" > And Prudence, her anger evaporated as swiftly as > sagely. > And she was as good as her word. She had not sometimes -- in these d.p. files -- a short line is caused by the fact that the line above it had an end-line-hyphenate that was "clothed" by d.p., so this test often returns a few false-alarms. but since it also manages to locate the occasional bug as well -- for example, the "sagely" case -- it's generally worth doing anyway. *** other anomalies happen, and these checks can help find them... sometimes the o.c.r. misplaces a line totally: > with a sigh. > with a sigh. 
Then as an after-thought: "He seems *** these routines can also help you find indented passages: > Infallibly end in connubial strife-' > 'Maid who angers faithful swain > Will shed more tears and know mere pain > Than she who loves and loves in vain."' > "' Harvest your wheat ere the August frost; > One breath of cold and the crop is lost'" > 'Though the tempest of life will oft shut out the past, > The thoughts of our school-days remain to the last.'" > Dead'ning a mind to lofty thought for which by nature meant.1?? > "Nature designs all human ills, but in the making > 'Love feeds on kisses, we read in ancient lay; > Meaning the love of yore; not of to-day,"?? *** sometimes the bug is at the beginning of the line: > "that's my opinion," said Grey definitely. > "That's my opinion," said Grey definitely. > ?If------" > "If------" > -to do what I wanted," he resumed. "No > "--to do what I wanted," he resumed. "No > \Vho did you say?" asked Hervey, with a quick > "Who did you say?" asked Hervey, with a quick > ?And------" > "And------" > --And besides------" > "--And besides------" > 'Tain't the prairie," he muttered. "Too thick. > "'Tain't the prairie," he muttered. "Too thick. and sometimes the bug is at the end of the line: > I know. I had hoped that it was madness. > I know. I had hoped that it was madness. There > how, in the paroxysm of her grief, she had. > how, in the paroxysm of her grief, she had and sometimes there's a bug at the beginning and the end: > u Sakes alive, girl, yes. It's the way you have said. > "Sakes alive, girl, yes. It's the way you have said *** finally, let's look at the paragraphs around pagebreaks. as discussed earlier, the routine that rfrank uses overstates paragraphs found on pagebreaks, since it only looks at the first line of the new page. this means he had 35 false-alarms of new paragraphs that were _not_: > "file 015" = The Indian served out the pork with ruthless hands. > "file 016" = Followed the track of a dog-train. 
It came some > "file 020" = Straight before him with eyes that seemed to draw > "file 025" = Wake--no." And he finished up with a shake of the > "file 058" = Then as an after-thought: ?" He seems > "file 066" = Grey broke off abruptly. Darkness hid the angry > "file 067" = Iredale and myself in the house," Grey went on > "file 091" = Did you ask any one's advice when you married > "file 114" = "It's business, you know. Besides, it won't take > "file 124" = They all too cordially agreed with her to defend the > "file 127" = The object was a sleigh. And the speed at which It > "file 134" = Prudence as though some barrier had suddenly shut > "file 151" = Even the doting affection of his mother had not > "file 152" = Prudence, nor was he slow to appreciate the possl? > "file 154" = Iredale, who was standing in his doorway when he > "file 184" = "A light," sfee said. "?That must be the ranch. Quick, > "file 199" = Then it drew back sharply with its little upstanding > "file 201" = Hervey was about to follow, but a strange sound > "file 205" = George Iredale's wealth. The despicable methods > "file 211" = Prudence Mailing, or he must marry her, and break from > "file 215" = He realized all that a lover may realize of his own > "file 220" = She had ceased work to greet him, but she did not > "file 221" = The man watched the nimble fingers intently as they > "file 240" = That was a positive stroke of genius of yours in > "file 244" = "God! but if you stay here an instant longer, I'll > "file 258" = The proving of his charge was a matter which would > "file 259" = A STAB IN THE DARK 253 > "file 267" = The man waited. He did not wish to hurry her. He > "file 275" = She had seen with her own eyes the doings at the > "file 278" = He would then go unpunished. Leslie's death would > "file 283" = Miss Thoughtless! 
Don't keep him there a-philandering > "file 302" = She drank in his words with a soul-consuming thirst > "file 305" = Suddenly she released herself and moved to arm's > "file 310" = It was merely a grain station for the district and in > "file 342" = He grinned over at Iredale. (for convenience in viewing the o.c.r. output, i give the warped d.p. numbers.) *** in actuality, there were 105 cases of sentence-ending termination on the previous page linked with a first-line capitalization on the current page... and only 89 of those were actually _new_ paragraphs upon examination. the 16 which were not are listed here, in case you'd like to confirm them: > http://z-m-l.com/go/hound/houndp009.html > The Indian served out the pork with ruthless hands. > http://z-m-l.com/go/hound/houndp010.html > Followed the track of a dog-train. It came some > http://z-m-l.com/go/hound/houndp019.html > Wake--no." And he finished up with a shake of the > http://z-m-l.com/go/hound/houndp085.html > Did you ask any one's advice when you married > http://z-m-l.com/go/hound/houndp108.html > "It's business, you know. Besides, it won't take > http://z-m-l.com/go/hound/houndp145.html > Even the doting affection of his mother had not > http://z-m-l.com/go/hound/houndp178.html > "A light," sfee said. "That must be the ranch. Quick, > http://z-m-l.com/go/hound/houndp193.html > Then it drew back sharply with its little upstanding > http://z-m-l.com/go/hound/houndp209.html > He realized all that a lover may realize of his own > http://z-m-l.com/go/hound/houndp215.html > The man watched the nimble fingers intently as they > http://z-m-l.com/go/hound/houndp252.html > The proving of his charge was a matter which would > http://z-m-l.com/go/hound/houndp261.html > The man waited. He did not wish to hurry her. He > http://z-m-l.com/go/hound/houndp269.html > She had seen with her own eyes the doings at the > http://z-m-l.com/go/hound/houndp272.html > He would then go unpunished. 
Leslie's death would > http://z-m-l.com/go/hound/houndp296.html > She drank in his words with a soul-consuming thirst > http://z-m-l.com/go/hound/houndp299.html > Suddenly she released herself and moved to arm's *** whew! :+) it's funny, because it takes hours and hours to write this stuff up, whereas it takes a _much_ shorter time to actually _do_ the work. it wouldn't surprise me if it takes more time to _read_ the write-up than to actually do the work. so if you read all of this, good for you! *** so there you go, a first pass at cleaning the o.c.r. for this whole book, one that concerned itself solely with getting the paragraphing correct, the paragraph-terminators correct, and the casing of line-starts correct. i didn't add 'em up, but it's clear there are literally hundreds of errors. all of which could have been found and quickly fixed before this text went out to the valuable volunteers over at distributed proofreaders, saving them a _boatload_ of time and energy finding and fixing bugs, meaning that the "final seal of approval" would've come _much_ faster, because instead of doing 3 rounds of proofing, they could've done 1... *** here's the summary of our lessons: 2009-01-01. rename files and change separators, if necessary. 2009-01-02. fix the paragraphing, including the terminators... 2009-01-03. fix the paragraphing around the pagebreaks too. 2009-01-04. compile a list of the proper nouns in the book... 2009-01-05. check out the sentence-terminators in the book... 2009-01-06. examine any cases of numbers within the text... 2009-01-07. check for bad or atypical punctuation patterns... 2009-01-08. fix the spacey-quotes, both double and single... 2009-01-09. fix dashes, em-dashes, and hyphenation issues... 2009-01-10. check for, and fix, any weirdnesses that appear... 2009-01-11. do spellcheck, with the book-specific dictionary... 2009-01-12. seek out external validation, when it's possible...
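for readers who want to try the paragraphing checks from lessons 2009-01-02 and 2009-01-03, here is a minimal python sketch. it is an illustration only, not the routines used for this write-up: the function names, the stripping of leading and trailing quote-marks, and the 45-character cutoff for a "short" line are all assumptions made for the example. it takes a page as a list of text lines and returns line indexes (or page indexes) that a person should then look at.

```python
def _strip_quotes(text):
    """Drop surrounding whitespace and quote marks (assumed helper)."""
    return text.strip().strip('"\'')


def lowercase_after_blank(lines):
    """Indexes of lines that start lowercase right after a blank line."""
    return [i for i in range(1, len(lines))
            if not lines[i - 1].strip()
            and _strip_quotes(lines[i])[:1].islower()]


def quote_to_quote(lines):
    """Indexes of lines ending in a double-quote whose next line starts with one."""
    return [i for i in range(len(lines) - 1)
            if lines[i].rstrip().endswith('"')
            and lines[i + 1].lstrip().startswith('"')]


def short_not_followed_by_blank(lines, short=45):
    """Indexes of non-blank lines shorter than `short` with a non-blank line after.

    The cutoff of 45 characters is a hypothetical default, not a value
    taken from the write-up.
    """
    return [i for i in range(len(lines) - 1)
            if lines[i].strip()
            and len(lines[i].rstrip()) < short
            and lines[i + 1].strip()]


def pagebreak_candidates(pages):
    """Page indexes whose first line may open a new paragraph.

    A page qualifies when the previous page's last non-blank line ends with
    . ? or ! and this page's first non-blank line starts with a capital.
    These are only candidates: each one still needs a visual check.
    """
    hits = []
    for n in range(1, len(pages)):
        prev = [line for line in pages[n - 1] if line.strip()]
        cur = [line for line in pages[n] if line.strip()]
        if not prev or not cur:
            continue
        last = _strip_quotes(prev[-1])
        first = _strip_quotes(cur[0])
        if last and last[-1] in ".?!" and first[:1].isupper():
            hits.append(n)
    return hits
```

none of these functions changes the text. each one only produces a list of places to examine, which fits the advice above to look at the cases rather than change them blindly.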
and now we extend the lessons from part 2 to the whole book: 2009-01-14. rename files and change separators, if necessary. 2009-01-15. fix the paragraphing, including the terminators... 2009-01-16. fix the paragraphing around the pagebreaks too. see you tomorrow! -bowerbird p.s. the list of 200+ probable cases of incorrect paragraph termination: 001> something that crackled slightly under his weight 002> seemed to think well before he answered. Then-- 003> over and roused the traveller, 004> lost the dog track" 005> "Best not bother to smoke to-night" 006> intensity-- 007> iron of the stove. Thus he sat for an hour, looking 008> the trail------" 009> ?If------" 010> sight 011> journey down country. That's the dog's part 012> cent of it. You------" 013> "If that were possible I guess we ought to make 014> the primest bacon. Hallo, here comes the d------d 015> Wake--no." And he finished up with a shake of the 016> The Indian shook his head, 017> Guess he can fix up in that till this d------d breeze 018> place for depositing------" 019> "And reams of' returns.'" 020> breakfast 021> a smile. "No, I won't experiment" 022> I knew these regions well enough, but I didn't I 023> life------" 024> I had just enough------" 025> now hanging forward and his chin rested on his chest 026> -to do what I wanted," he resumed. 
"No 027> moment 028> left or right 029> would------ 030> for food and rest 031> "The d------d cur seems to know the range of a 032> it 033> a bitter laugh shocked the silence of the snow-bound 034> travel with me you'd best come along, and be d------d 035> more significant event in her life even than when she 036> Hephzibah was not a woman to set her affections 037> Then as an after-thought: " He seems 038> "But I want------" 039> the paper and read in a solemn whisper-- 040> became all interest 041> something like this-- 042> Infallibly end in connubial strife-' 043> said roughly, 044> It's not right to ask him when I am here, besides------" 045> "Anyway I don't think there is room for both 046> The school-ma'am whispered impressively-- 047> Than she who loves and loves in vain."' 048> have stayed to tea. The party begins at seven, don't 049> "I have some important work to do------" 050> the door. "Aren't you------" 051> profoundest tone-- 052> One breath of cold and the crop is lost'" 053> condition of his mind demanded nothing less than a 054> of inquiry than of making a statement 055> second as though in doubt; then his voice reached the 056> like a timber-wolf. Send him out" 057> Prudence's wedding------" 058> too, joined in the enthusiasm of the moment 059> The thoughts of our school-days remain to the last.'" 060> fruit farm is a failure and I am trying to sell it" 061> observing Sarah had been sure of it 062> hired girl-- 063> accent 064> was gentle and caressing. Hervey suddenly called to 065> "Don't go near him. He's as treacherous as a 066> revel with the best 067> Furrers. Daisy, Fortune, and Rachel, three girls of 068> fun began from the very first 069> like--Timothy grass. Stand still while I fix it" 070> any notions. He just said he was going to marry 071> me--and he did marry me. I was all sort of swept 072> off my feet" 073> there------" 074> "Then he's a fool. But you try him," Iredale said 075> when I've got a lot to think about" 076> "Thanks. 
I hardly expected it" 077> far. You shall apologize or------" 078> make the list of your accusations complete before------" 079> have said, but------" 080> "Shut up I" 081> replaced by a look of keen earnestness, 082> anybody else. I've seen too much," 083> His companion looked up with a violent start 084> eyes that fixed me. You remember those two great, 085> deep-sunken, cow-eyes of his------" Robb broke off as 086> Then he went on reflectively: " But no, it couldn't 087> have been he. Ah------" He broke off and glanced 088> Robb, relinquishing his hold on the cutter's rail 089> His right arm was raised and his hand gripped a 090> No one replied to the old lady's heated complaint 091> snow. Hope rose at a bound to wild, eager delight 092> "You don't say------" Mrs. Mailing gasped; it was 093> "It can't-----" The minister got no further, and 094> "It's-----" some one said and broke off. Then 095> He was obeyed implicitly. But his order carne 096> Only Hephzibah Mailing stood her ground. Her 097> She uttered no sound. All her anger against the 098> prostrate man had vanished; a world of pity was in 099> happened slowly forced itself upon his mind 100> Grey's chest 101> her brain in an unchecked torrent It seen ^1 to 102> face, spoke sharply. She voiced a common thought 103> "He--he--did--it. Free--P--Press. Yell--ow-- 104> G-----" The last word died away to a gurgle. A 105> leaving it 106> encouragement-- 107> imminent 108> The girl felt angry still, but Hervey's tone slightly 109> recognized the fact that Iredale was in love with 110> ?aw the ranch through the trees, and he greeted 111> "Well," he said, after shaking his host by the 112> hand, "another mile of this d------d valley and I 113> expansively-- 114> overwhelming. If I can--er--be------" 115> security on my certain inheritance of the farm------" 116> Iredale spoke with such indifference about the 117> passed it across the table. 
His only remark was-- 118> "Curse the man for his d------d superiority," he 119> so easy to get lost" 120> "Of course; winter would be different, wouldn't 121> object. Besides, I have not turned poacher yet" 122> "Nothing of the sort" 123> callous significance--Nothing I 124> Alice gazed at the other curiously. Then-- 125> then--and then--" 126> me before. I must prove myself to myself before-- 127> before------" She broke off, only to resume again with 128> my conscience. I could never trust myself. George 129> may love me now; I believe I love him, but------No, 130> protest-- 131> was resolved to follow it 132> "Are------Oh, come away, I can't stand it" 133> with pleasure. Prudence answered at once-- 134> and turned his great eyes upon his sister 135> "One of these days I'll give you something to 136> snarl at, you d-----d cur," Hervey said, between his 137> he leant upon one of the fence posts and looked 138> from the hut There was something very like the 139> grave and lay bare the mouldering bones it contained 140> to little yelps of eagerness as he went 141> breakfast 142> own interests, and he revelled in the thought of 143> The sudden appearance of the light was the signal 144> word muttered below his breath voiced his discovery- 145> narrow volume which bore on the cover the legend-- 146> things. Either he must renounce all thoughts of 147> "Mv DEAR MR. IREDALE," 148> "Yours truly, 149> man.'" 150> with an impatient tug at the material on which 151> "Food for mental occupation," said Sarah, 152> Dead'ning a mind to lofty thought for which by nature meant.1* 153> Iredale moved over to where Prudence was sitting 154> stooped and caressed the great dog at her feet 155> very, very much, but------" 156> discover the wretch who did him to death------" 157> much a reality. I cannot marry you--until--until-----" 158> lightly. 
You would regret your decision later on, 159> and then------No, mother and I will see the matter 160> ears with maddening insistence-- 161> Truly his sins were finding him out 162> shuddered to think 163> make in contemplating a succulent dish, 164> was not lost upon his visitor. Then he went on-- 165> of such an ideal spot as this from which to operate 166> come to a definite arrangement" 167> here to-night" 168> "Out of my house, you scum!" Iredale roared 169> case and turned back to the stove. She 170> quite cheerfully-- 171> ?And------" 172> "Very well If you can convince me, it shall be as 173> "Yes--and you------" 174> ascertained the rest" 175> spotless sun-bonnet 176> the ranch and challenged George Iredale------" 177> authorship of that notice------" 178> the steady precision of the ticking-- 179> at the bureau, muttering-- 180> -And besides------" 181> law-breaker; at the worst he was------ 182> "Maybe she learned you, my girL" 183> Suggests the cure which best is for the taking.'" 184> fretting herself because------" 185> "This is real good. Bring him in I Bring him in, 186> Meaning the love of yore; not of to-day,"" 187> ladies present 188> was a pause. Then Mrs. Mailing broke it-- 189> to make Ainsley that night 190> "What is the creature's name? I didn't catch it" 191> 'Bully.'" 192> half-light 193> and----- 194> though she could strike him for his calm words. Hel 195> thought 196> Still, she replied coldly-- 197> might bring you--life-long regret" 198> for it* 199> There was the briefest of pauses; then she went on-- 200> It lay nearly thirty-five miles due west of Owl Hoot 201> the willing beast beyond a rational gait 202> told her what had happened. The forest was on fire I 203> Then came an after-thought-- 204> to do yet" 205> "River '$1 stop it" 206> "Iredale's ranch burnt out? * 207> "Pleasure." And the man read the message-- 208> "Telegram for you, sir; ' expressed."" 209> he charges------" 210> night 211> he broke off. 
The next moment he went on angrily-- 212> She paused. Hervey broke in-- 213> him fairly, he thought 214> them------" 215> "But------" Prudence rushed forward, but Sarah 216> even stronger than those attributed to------" 217> "Not a word to her about--about-----* 218> So does Robb," 219> fHK BUD From Bowerbird at aol.com Thu Jan 22 18:47:43 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 22 Jan 2009 21:47:43 EST Subject: [gutvol-d] obama inaugural speech transcript Message-ID: hey michael, do you want me to do an independent digitization of the obama inaugural speech transcript? i've compared the "official" version with the "as given" text, and found only a few meaningless differences e.g., a couple "the" and "and" adds. > All this we can do. [And] All this we will do. > [And] Those of us who manage the public's dollars will be held to account -- > And so to all [the] other peoples and governments who are watching today, > we say we can no longer afford indifference to [the] suffering outside our borders; > They have something to tell us [today], also check typo on "khe sanh", and capitalization of "scripture" and "mall". spell-check flags "expedience's" for me -- and suggests "expediency's" -- but since both versions had it written that way, i decided to leave it that way. my "official version" was #28001. i got my "as delivered" version on some news site somewhere, and will be willing to _verify_ it myself against a video of the speech _if_ you think it would be desirable... otherwise, i'd be satisfied with my current version as "good enough". -bowerbird p.s. might as well give it to you...
===================== Presidential Inaugural Address of Barack Obama on January 20, 2009. by Barack Obama My fellow citizens: I stand here today humbled by the task before us, grateful for the trust you have bestowed, mindful of the sacrifices borne by our ancestors. I thank President Bush for his service to our nation, as well as the generosity and cooperation he has shown throughout this transition. Forty-four Americans have now taken the presidential oath. The words have been spoken during rising tides of prosperity and the still waters of peace. Yet, every so often the oath is taken amidst gathering clouds and raging storms. At these moments, America has carried on not simply because of the skill or vision of those in high office, but because we the people have remained faithful to the ideals of our forebears, and true to our founding documents. So it has been. So it must be with this generation of Americans. That we are in the midst of crisis is now well understood. Our nation is at war, against a far-reaching network of violence and hatred. Our economy is badly weakened, a consequence of greed and irresponsibility on the part of some, but also our collective failure to make hard choices and prepare the nation for a new age. Homes have been lost; jobs shed; businesses shuttered. Our health care is too costly; our schools fail too many; and each day brings further evidence that the ways we use energy strengthen our adversaries and threaten our planet. These are the indicators of crisis, subject to data and statistics. Less measurable but no less profound is a sapping of confidence across our land -- a nagging fear that America's decline is inevitable, and that the next generation must lower its sights. Today I say to you that the challenges we face are real. They are serious and they are many. They will not be met easily or in a short span of time. But know this, America -- they will be met. 
On this day, we gather because we have chosen hope over fear, unity of purpose over conflict and discord. On this day, we come to proclaim an end to the petty grievances and false promises, the recriminations and worn out dogmas, that for far too long have strangled our politics. We remain a young nation, but in the words of scripture, the time has come to set aside childish things. The time has come to reaffirm our enduring spirit; to choose our better history; to carry forward that precious gift, that noble idea, passed on from generation to generation: the God-given promise that all are equal, all are free and all deserve a chance to pursue their full measure of happiness. In reaffirming the greatness of our nation, we understand that greatness is never a given. It must be earned. Our journey has never been one of shortcuts or settling for less. It has not been the path for the faint-hearted -- for those who prefer leisure over work, or seek only the pleasures of riches and fame. Rather, it has been the risk-takers, the doers, the makers of things -- some celebrated but more often men and women obscure in their labor, who have carried us up the long, rugged path towards prosperity and freedom. For us, they packed up their few worldly possessions and traveled across oceans in search of a new life. For us, they toiled in sweatshops and settled the West; endured the lash of the whip and plowed the hard earth. For us, they fought and died, in places like Concord and Gettysburg; Normandy and Khe Sanh. Time and again these men and women struggled and sacrificed and worked till their hands were raw so that we might live a better life. They saw America as bigger than the sum of our individual ambitions; greater than all the differences of birth or wealth or faction. This is the journey we continue today. We remain the most prosperous, powerful nation on Earth. Our workers are no less productive than when this crisis began. 
Our minds are no less inventive, our goods and services no less needed than they were last week or last month or last year. Our capacity remains undiminished. But our time of standing pat, of protecting narrow interests and putting off unpleasant decisions -- that time has surely passed. Starting today, we must pick ourselves up, dust ourselves off, and begin again the work of remaking America. For everywhere we look, there is work to be done. The state of the economy calls for action, bold and swift, and we will act -- not only to create new jobs, but to lay a new foundation for growth. We will build the roads and bridges, the electric grids and digital lines that feed our commerce and bind us together. We will restore science to its rightful place, and wield technology's wonders to raise health care's quality and lower its cost. We will harness the sun and the winds and the soil to fuel our cars and run our factories. And we will transform our schools and colleges and universities to meet the demands of a new age. All this we can do. And all this we will do. Now, there are some who question the scale of our ambitions -- who suggest that our system cannot tolerate too many big plans. Their memories are short. For they have forgotten what this country has already done; what free men and women can achieve when imagination is joined to common purpose, and necessity to courage. What the cynics fail to understand is that the ground has shifted beneath them -- that the stale political arguments that have consumed us for so long no longer apply. The question we ask today is not whether our government is too big or too small, but whether it works -- whether it helps families find jobs at a decent wage, care they can afford, a retirement that is dignified. Where the answer is yes, we intend to move forward. Where the answer is no, programs will end. 
And those of us who manage the public's dollars will be held to account -- to spend wisely, reform bad habits, and do our business in the light of day -- because only then can we restore the vital trust between a people and their government. Nor is the question before us whether the market is a force for good or ill. Its power to generate wealth and expand freedom is unmatched, but this crisis has reminded us that without a watchful eye, the market can spin out of control -- and that a nation cannot prosper long when it favors only the prosperous. The success of our economy has always depended not just on the size of our gross domestic product, but on the reach of our prosperity; on our ability to extend opportunity to every willing heart -- not out of charity, but because it is the surest route to our common good. As for our common defense, we reject as false the choice between our safety and our ideals. Our founding fathers, faced with perils we can scarcely imagine, drafted a charter to assure the rule of law and the rights of man, a charter expanded by the blood of generations. Those ideals still light the world, and we will not give them up for expedience's sake. And so to all the other peoples and governments who are watching today, from the grandest capitals to the small village where my father was born: know that America is a friend of each nation and every man, woman, and child who seeks a future of peace and dignity, and that we are ready to lead once more. Recall that earlier generations faced down fascism and communism not just with missiles and tanks, but with sturdy alliances and enduring convictions. They understood that our power alone cannot protect us, nor does it entitle us to do as we please. Instead, they knew that our power grows through its prudent use; our security emanates from the justness of our cause, the force of our example, the tempering qualities of humility and restraint. We are the keepers of this legacy. 
Guided by these principles once more, we can meet those new threats that demand even greater effort -- even greater cooperation and understanding between nations. We will begin to responsibly leave Iraq to its people, and forge a hard-earned peace in Afghanistan. With old friends and former foes, we will work tirelessly to lessen the nuclear threat, and roll back the specter of a warming planet. We will not apologize for our way of life, nor will we waver in its defense, and for those who seek to advance their aims by inducing terror and slaughtering innocents, we say to you now that our spirit is stronger and cannot be broken; you cannot outlast us, and we will defeat you. For we know that our patchwork heritage is a strength, not a weakness. We are a nation of Christians and Muslims, Jews and Hindus -- and non-believers. We are shaped by every language and culture, drawn from every end of this Earth; and because we have tasted the bitter swill of civil war and segregation, and emerged from that dark chapter stronger and more united, we cannot help but believe that the old hatreds shall someday pass; that the lines of tribe shall soon dissolve; that as the world grows smaller, our common humanity shall reveal itself; and that America must play its role in ushering in a new era of peace. To the Muslim world, we seek a new way forward, based on mutual interest and mutual respect. To those leaders around the globe who seek to sow conflict, or blame their society's ills on the West -- know that your people will judge you on what you can build, not what you destroy. To those who cling to power through corruption and deceit and the silencing of dissent, know that you are on the wrong side of history; but that we will extend a hand if you are willing to unclench your fist. To the people of poor nations, we pledge to work alongside you to make your farms flourish and let clean waters flow; to nourish starved bodies and feed hungry minds. 
And to those nations like ours that enjoy relative plenty, we say we can no longer afford indifference to the suffering outside our borders; nor can we consume the world's resources without regard to effect. For the world has changed, and we must change with it. As we consider the road that unfolds before us, we remember with humble gratitude those brave Americans who, at this very hour, patrol far-off deserts and distant mountains. They have something to tell us today, just as the fallen heroes who lie in Arlington whisper through the ages. We honor them not only because they are guardians of our liberty, but because they embody the spirit of service; a willingness to find meaning in something greater than themselves. And yet, at this moment -- a moment that will define a generation -- it is precisely this spirit that must inhabit us all. For as much as government can do and must do, it is ultimately the faith and determination of the American people upon which this nation relies. It is the kindness to take in a stranger when the levees break, the selflessness of workers who would rather cut their hours than see a friend lose their job which sees us through our darkest hours. It is the firefighter's courage to storm a stairway filled with smoke, but also a parent's willingness to nurture a child, that finally decides our fate. Our challenges may be new. The instruments with which we meet them may be new. But those values upon which our success depends -- hard work and honesty, courage and fair play, tolerance and curiosity, loyalty and patriotism -- these things are old. These things are true. They have been the quiet force of progress throughout our history. What is demanded then is a return to these truths. 
What is required of us now is a new era of responsibility -- a recognition, on the part of every American, that we have duties to ourselves, our nation, and the world, duties that we do not grudgingly accept but rather seize gladly, firm in the knowledge that there is nothing so satisfying to the spirit, so defining of our character, than giving our all to a difficult task. This is the price and the promise of citizenship. This is the source of our confidence -- the knowledge that God calls on us to shape an uncertain destiny. This is the meaning of our liberty and our creed -- why men and women and children of every race and every faith can join in celebration across this magnificent Mall, and why a man whose father less than sixty years ago might not have been served at a local restaurant can now stand before you to take a most sacred oath. So let us mark this day with remembrance, of who we are and how far we have traveled. In the year of America's birth, in the coldest of months, a small band of patriots huddled by dying campfires on the shores of an icy river. The capital was abandoned. The enemy was advancing. The snow was stained with blood. At a moment when the outcome of our revolution was most in doubt, the father of our nation ordered these words be read to the people: "Let it be told to the future world... that in the depth of winter, when nothing but hope and virtue could survive... that the city and the country, alarmed at one common danger, came forth to meet (it)." America, in the face of our common dangers, in this winter of our hardship, let us remember these timeless words. With hope and virtue, let us brave once more the icy currents, and endure what storms may come. 
Let it be said by our children's children that when we were tested we refused to let this journey end, that we did not turn back nor did we falter; and with eyes fixed on the horizon and God's grace upon us, we carried forth that great gift of freedom and delivered it safely to future generations. Thank you. God bless you. And God bless the United States of America.

From Bowerbird at aol.com Fri Jan 23 13:56:21 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 23 Jan 2009 16:56:21 EST
Subject: [gutvol-d] how to clean o.c.r. -- 2009-01-17
Message-ID:

this is the january 2009 series on "how to clean o.c.r.".
we're using the book "the hound from the north", available at d.p.
the book was previously split into 3 parts, but is now consolidated:
> http://www.pgdp.net/c/project.php?id=projectID494113af10e06

***

2009-01-17. compile a list of the proper nouns in the book...

i detailed the process of compiling the list of names in the book when we were discussing part 2, so there's no need to repeat it...

i have appended the list that was obtained from the whole book.

***

here's the summary of our lessons:
2009-01-01. rename files and change separators, if necessary.
2009-01-02. fix the paragraphing, including the terminators...
2009-01-03. fix the paragraphing around the pagebreaks too.
2009-01-04. compile a list of the proper nouns in the book...
2009-01-05. check out the sentence-terminators in the book...
2009-01-06. examine any cases of numbers within the text...
2009-01-07. check for bad or atypical punctuation patterns...
2009-01-08. fix the spacey-quotes, both double and single...
2009-01-09. fix dashes, em-dashes, and hyphenation issues...
2009-01-10. check for, and fix, any weirdnesses that appear...
2009-01-11. do spellcheck, with the book-specific dictionary...
2009-01-12. seek out external validation, when it's possible...

and now we extend the lessons from part 2 to the whole book:
2009-01-14. rename files and change separators, if necessary.
2009-01-15. fix the paragraphing, including the terminators...
2009-01-16. fix the paragraphing around the pagebreaks too.
2009-01-17. compile a list of the proper nouns in the book...

see you tomorrow!

-bowerbird

> A Stab In The Dark > A.L. Burt Company > Ainsley > Al > Alaskan > Alice > Alice Gordon > American > An Echo From The Alaskan Mountains > Andy > August > Aunt Sarah > Author Of > Badlands > Brooding Wild > Bully > By Jove > Canada > Canadian > Canine Vagaries > Canuk > Chapter I > Chapter II > Chapter III > Chapter IV > Chapter IX > Chapter V > Chapter VI > Chapter VII > Chapter VIII > Chapter X > Chapter XI > Chapter XII > Chapter XIII > Chapter XIV > Chapter XIX > Chapter XV > Chapter XVI > Chapter XVII > Chapter XVIII > Charles Livingston Bull > Chillingwood > Chinese > Chintz > Church > Covill > Covills > Cree Indian > Customs > Dakota > Damside > Damside City > Danvers > Deane > Dominion > Dominion Ranch > Doomsday > Dougal > Dr. Parash > Duffield > Dyke > East > Eldorado > Emma > Eskimos > Fate > Fort Cudahy > Fort Garry > Forty Mile Creek > Foss River Ranch > Free Press > Front Hill > Furrer > Furrers > Ganthorn > Ganthorns > George > George Iredale > German > Gleichen > God > Gordon > Gordon Duffield > Government > Grey > Grey's Last Words > Grieg > Gurridge > Harry > Harry Gleichen > Haunted Hill > Hephzibah > Hephzibah Malling > Hephzy > Hervey > Hervey Malling > Hill > Holy Orders > Hoot > Hotel > Hound From The North > I.O.U.s > In Conclusion > In The Mountains > Indian > Indians > Iredale > Jack Broad > John > Kitty > Kootenai > L.C.
Page > Lake > Lakeville > Law Breakers > Le Mar > Leonville > Leslie > Leslie Grey > Leslie Grey Fulfils His Destiny > Lonely Ranch > Lonely Ranch At Owl Hoot > Loon Dyke > Loon Dyke Farm > Lord > Lord Harry > Mackinaw > Manitoba > Mary > Master Hervey > Master Robb > Methodist > Mile Creek > Minister > Minnesota > Miss > Miss Covills > Miss Ganthorn > Miss Malling > Mistress Prudence > Molly > Monday > Mother > Mr. > Mr. Chillingwood > Mr. Danvers > Mr. Frances > Mr. George Iredale > Mr. Grey > Mr. Iredale > Mr. Leslie Grey > Mr. Malling > Mr. Robb Chillingwood > Mr. Smith > Mr. Zachary Smith > Mr. Zachary Smith Smokes > Mrs. > Mrs. Covill > Mrs. Ganthorn > Mrs. George Iredale > Mrs. Gurridge > Mrs. Malling > Nature > Neche > New York > Niagara > North-West Territories > Northern Union Hotel > Old Country > Owl > Owl Hoot > Owl Hoot Valley > Parash > Pass > Peter > Peter Furrer > Police > Progressive Euchre > Prudence > Prudence Malling > Prue > Publishers > Rainy-Moon > Rev. > Rev. Charles Danvers > Ridgwell Cullum > Ridley > Robb > Robb Chillingwood > Rockies > Rocky Mountains > Rodney House > Rodney House Hotel > Rosebank > Rosy > Rosy River > Sarah > Sarah Gurridge > Scot > Scotia > Scotland > Shire > Silas > Slump In Grey > Smith > Southern Manitoba > Spartan-like > St. John's > St. John's University > States > Stetson > Sutton > The Avenger > The Breaking Of The Storm > The Eskimo > The Forest Demon Pursues > The Graveyard At Owl Hoot > The Hound From The North > The Indian > The Last Of Lonely Ranch > The Lord > The Maggot At The Core > The Mallings > The Pass > The Progressive Euchre Party > The Return Of The Prodigal > The Rev. Charles Danvers > Tim > Tim Gleichen > Timothy > Toronto > Toronto Globe > Town Clerkship > Town Hall > United States > University > Verdon > West > Winnipeg > Winnipeg Free Press > Winter > With Frontispiece > Woods > Yellow Grey > Yukon > Yukon Valley > Zachary Smith

From debook2164 at hotmail.com Fri Jan 23 15:06:29 2009
From: debook2164 at hotmail.com (David Edwards)
Date: Fri, 23 Jan 2009 17:06:29 -0600
Subject: [gutvol-d] obama inaugural speech transcript
In-Reply-To:
References:
Message-ID:

The capitalization of Mall is correct. 'The Mall' is the proper name of the area from the Capitol building to the Lincoln Memorial and bordered by the various museums, the White House and other memorials.

David

From hart at pglaf.org Fri Jan 23 15:21:08 2009
From: hart at pglaf.org (Michael Hart)
Date: Fri, 23 Jan 2009 15:21:08 -0800 (PST)
Subject: [gutvol-d] obama inaugural speech transcript
In-Reply-To:
References:
Message-ID:

1. As I recall, it was "this. . .mall". . .as opposed to "The Mall," which would make it just one of many malls.

2. "The Mall," in caps, is usually something used by Washington people more than the rest of the country for /that/ mall.

3. I wonder how many people on how many college turfs say "The Quad/Quadrangle" in caps?

I did notice that the official transcript used caps, but I will not change MY transcript to go along with Washington insiders-- of which I most certainly am not one.
While I expected both New Yorkers and Washingtonians to do caps, I noticed that "The New York Times" transcript, which was quite obviously transcribed internally, did NOT use caps, which leads me to believe this is, indeed, a matter of local choice.

Where I came from, there is "The Mountain". . .no one calls the thing by name, but the real natives always capitalize it. I would never tell anyone else to do so, however.

In this case I feel the same way. . .each transcript should say what the person hearing the speech heard. . .punctuation, too.

However, if you want to chastise "The New York Times" for this, be my guest, I'd love to see the results. . . .

On Fri, 23 Jan 2009, David Edwards wrote:

>
> The capitalization of Mall is correct. 'The Mall' the proper name
> of the area from the Capital building to the Lincoln Memorial and
> bordered by the various museums, The White House and other
> memorials.
>
> DAvid
>
>
>
>
> EMAILING FOR THE GREATER GOODJoin me
>
> From: Bowerbird at aol.comDate: Thu, 22 Jan 2009 21:47:43 -0500To:
> gutvol-d at lists.pglaf.org; Bowerbird at aol.comSubject: [gutvol-d]
> obama inaugural speech transcripthey michael, do you want me to
> doan independent digitization of theobama inaugural speech
> transcript?i've compared the "official" versionwith the "as given"
> text, and foundonly a few meaningless differencese.g., a couple
> "the" and "and" adds.> All this we can do.
[And] All this we will > do.> [And] Those of us who manage the public's dollars will be > held to account --> And so to all [the] other peoples and > governments who are watching today,> we say we can no longer > afford indifference to [the] suffering outside our borders;> They > have something to tell us [today],also check typo on "khe sanh", > and capitalization of "scripture" and "mall".spell-check flags > "expedience's" for me -- and suggests "expediency's" --but since > both versions had it written that way, i decided to leave it that > way.my "official version" was #28001.i got my "as delivered" > version onsome news site somewhere, andwill be willing to _verify_ > it myselfagainst a video of the speech _if_you think it would be > desirable...otherwise, i'd be satisfied with mycurrent version as > "good enough".-bowerbirdp.s. might as well give it to > you...=====================Presidential Inaugural Addressof Barack > Obamaon January 20, 2009.by Barack ObamaMy fellow citizens:I stand > here today humbled by the task before us,grateful for the trust > you have bestowed,mindful of the sacrifices borne by our > ancestors.I thank President Bush for his service to our nation,as > well as the generosity and cooperation he hasshown throughout this > transition.Forty-four Americans have now taken the presidential > oath.The words have been spoken during rising tides of > prosperityand the still waters of peace. Yet, every so often the > oath is takenamidst gathering clouds and raging storms. At these > moments,America has carried on not simply because of the skill or > visionof those in high office, but because we the people have > remainedfaithful to the ideals of our forebears, and true to our > founding documents.So it has been. 
So it must be with this > generation of Americans.That we are in the midst of crisis is now > well understood.Our nation is at war, against a far-reaching > network of violence and hatred.Our economy is badly weakened, a > consequence ofgreed and irresponsibility on the part of some,but > also our collective failure to make hard choicesand prepare the > nation for a new age.Homes have been lost; jobs shed; businesses > shuttered.Our health care is too costly; our schools fail too > many; and each day brings further evidence that the ways we use > energystrengthen our adversaries and threaten our planet.These are > the indicators of crisis, subject to data and statistics.Less > measurable but no less profound is a sapping of confidenceacross > our land -- a nagging fear that America's decline is > inevitable,and that the next generation must lower its > sights.Today I say to you that the challenges we face are > real.They are serious and they are many.They will not be met > easily or in a short span of time.But know this, America -- they > will be met.On this day, we gather because we have chosenhope over > fear,unity of purpose over conflict and discord.On this day, we > come to proclaim an end to the petty grievancesand false promises, > the recriminations and worn out dogmas,that for far too long have > strangled our politics.We remain a young nation, but in the words > of scripture,the time has come to set aside childish things.The > time has come to reaffirm our enduring spirit;to choose our better > history; to carry forward that precious gift,that noble idea, > passed on from generation to generation:the God-given promise that > all are equal, all are free andall deserve a chance to pursue > their full measure of happiness.In reaffirming the greatness of > our nation,we understand that greatness is never a given. 
It must > be earned.Our journey has never been one of shortcuts or settling > for less.It has not been the path for the faint-hearted --for > those who prefer leisure over work,or seek only the pleasures of > riches and fame.Rather, it has been the risk-takers, the doers, > the makers of things --some celebrated but more often men and > women obscure in their labor,who have carried us up the long, > rugged path towards prosperity and freedom.For us, they packed up > their few worldly possessionsand traveled across oceans in search > of a new life.For us, they toiled in sweatshops and settled the > West;endured the lash of the whip and plowed the hard earth.For > us, they fought and died, in places likeConcord and Gettysburg; > Normandy and Khe Sanh.Time and again these men and women struggled > and sacrificed andworked till their hands were raw so that we > might live a better life.They saw America as bigger than the sum > of our individual ambitions;greater than all the differences of > birth or wealth or faction.This is the journey we continue > today.We remain the most prosperous, powerful nation on Earth.Our > workers are no less productive than when this crisis began.Our > minds are no less inventive, our goods and services no less > neededthan they were last week or last month or last year.Our > capacity remains undiminished.But our time of standing pat, of > protecting narrow interests andputting off unpleasant decisions -- > that time has surely passed.Starting today, we must pick ourselves > up, dust ourselves off,and begin again the work of remaking > America.For everywhere we look, there is work to be done.The state > of the economy calls for action, bold and swift,and we will act -- > not only to create new jobs,but to lay a new foundation for > growth.We will build the roads and bridges, the electric grids and > digital linesthat feed our commerce and bind us together.We will > restore science to its rightful place, and wield technology's > wondersto 
raise health care's quality and lower its cost.We will > harness the sun and the winds and the soilto fuel our cars and run > our factories.And we will transform our schools and colleges and > universitiesto meet the demands of a new age.All this we can > do.And all this we will do.Now, there are some who question the > scale of our ambitions --who suggest that our system cannot > tolerate too many big plans.Their memories are short.For they have > forgotten what this country has already done;what free men and > women can achievewhen imagination is joined to common purpose, and > necessity to courage.What the cynics fail to understand is thatthe > ground has shifted beneath them --that the stale political > arguments thathave consumed us for so long no longer apply.The > question we ask today isnot whether our government is too big or > too small,but whether it works -- whether it helps families > findjobs at a decent wage, care they can afford, a retirement that > is dignified.Where the answer is yes, we intend to move > forward.Where the answer is no, programs will end.And those of us > who manage the public's dollars will be held to account --to spend > wisely, reform bad habits, and do our business in the light of day > --because only then can we restore the vital trustbetween a people > and their government.Nor is the question before us whether the > market is a force for good or ill.Its power to generate wealth and > expand freedom is unmatched,but this crisis has reminded us that > without a watchful eye,the market can spin out of control --and > that a nation cannot prosper long when it favors only the > prosperous.The success of our economy has always dependednot just > on the size of our gross domestic product,but on the reach of our > prosperity;on our ability to extend opportunity to every willing > heart --not out of charity, but because it is the surest route to > our common good.As for our common defense,we reject as false the > choice between our 
safety and our ideals.Our founding fathers, > faced with perils we can scarcely imagine,drafted a charter to > assure the rule of law and the rights of man,a charter expanded by > the blood of generations.Those ideals still light the world,and we > will not give them up for expedience's sake.And so to all the > other peoples and governments who are watching today,from the > grandest capitals to the small village where my father was > born:know that America is a friend of each nation andevery man, > woman, and child who seeks a future of peace and dignity,and that > we areready to lead once more.Recall that earlier generations > faced down fascism and communismnot just with missiles and tanks, > butwith sturdy alliances and enduring convictions.They understood > that our power alone cannot protect us,nor does it entitle us to > do as we please.Instead, they knew that our power grows through > its prudent use;our security emanates from the justness of our > cause,the force of our example, the tempering qualities of > humility and restraint.We are the keepers of this legacy. 
Guided > by these principles once more, we can meet those new threats that > demand even greater effort -- even greater cooperation and > understanding between nations. We will begin to responsibly leave > Iraq to its people, and forge a hard-earned peace in > Afghanistan. With old friends and former foes, we will work > tirelessly to lessen the nuclear threat, and roll back the specter > of a warming planet. We will not apologize for our way of life, nor > will we waver in its defense, and for those who seek to advance > their aims by inducing terror and slaughtering innocents, we say to > you now that our spirit is stronger and cannot be broken; you > cannot outlast us, and we will defeat you. For we know that our > patchwork heritage is a strength, not a weakness. We are a nation > of Christians and Muslims, Jews and Hindus -- and non-believers. We > are shaped by every language and culture, drawn from every end of > this Earth; and because we have tasted the bitter swill of civil > war and segregation, and emerged from that dark chapter stronger > and more united, we cannot help but believe that the old hatreds > shall someday pass; that the lines of tribe shall soon > dissolve; that as the world grows smaller, our common humanity > shall reveal itself; and that America must play its role in > ushering in a new era of peace. To the Muslim world, we seek a new > way forward, based on mutual interest and mutual respect. To those > leaders around the globe who seek to sow conflict, or blame their > society's ills on the West -- know that your people will judge you > on what you can build, not what you destroy. To those who cling to > power through corruption and deceit and the silencing of > dissent, know that you are on the wrong side of history; but that > we will extend a hand if you are willing to unclench your fist. To > the people of poor nations, we pledge to work alongside you to make > your farms flourish and let clean waters flow; to nourish starved > bodies and feed hungry minds. And to
those nations like ours that > enjoy relative plenty, we say we can no longer afford indifference > to the suffering outside our borders; nor can we consume the > world's resources without regard to effect. For the world has > changed, and we must change with it. As we consider the road that > unfolds before us, we remember with humble gratitude those brave > Americans who, at this very hour, patrol far-off deserts and > distant mountains. They have something to tell us today, just as the > fallen heroes who lie in Arlington whisper through the ages. We > honor them not only because they are guardians of our liberty, but > because they embody the spirit of service; a willingness to find > meaning in something greater than themselves. And yet, at this > moment -- a moment that will define a generation -- it is precisely > this spirit that must inhabit us all. For as much as government can > do and must do, it is ultimately the faith and determination of the > American people upon which this nation relies. It is the kindness to > take in a stranger when the levees break, the selflessness of > workers who would rather cut their hours than see a friend lose > their job which sees us through our darkest hours. It is the > firefighter's courage to storm a stairway filled with smoke, but > also a parent's willingness to nurture a child, that finally > decides our fate. Our challenges may be new. The instruments with > which we meet them may be new. But those values upon which our > success depends -- hard work and honesty, courage and fair > play, tolerance and curiosity, loyalty and patriotism -- these > things are old.
These things are true. They have been the quiet > force of progress throughout our history. What is demanded then is > a return to these truths. What is required of us now is a new era > of responsibility -- a recognition, on the part of every American, > that we have duties to ourselves, our nation, and the world, duties > that we do not grudgingly accept but rather seize gladly, firm in > the knowledge that there is nothing so satisfying to the spirit, so > defining of our character, than giving our all to a difficult > task. This is the price and the promise of citizenship. This is the > source of our confidence -- the knowledge that God calls on us to > shape an uncertain destiny. This is the meaning of our liberty and > our creed -- why men and women and children of every race and every > faith can join in celebration across this magnificent Mall, and why > a man whose father less than sixty years ago might not have been > served at a local restaurant can now stand before you to take a > most sacred oath. So let us mark this day with remembrance, of who > we are and how far we have traveled. In the year of America's > birth, in the coldest of months, a small band of patriots huddled > by dying campfires on the shores of an icy river. The capital was > abandoned. The enemy was advancing. The snow was stained with > blood. At a moment when the outcome of our revolution was most in > doubt, the father of our nation ordered these words be read to the > people: "Let it be told to the future world... that in the depth of > winter, when nothing but hope and virtue could survive... that the > city and the country, alarmed at one common danger, came forth to > meet (it)." America, in the face of our common dangers, in this > winter of our hardship, let us remember these timeless words.
With > hope and virtue, let us brave once more the icy currents, and > endure what storms may come. Let it be said by our children's > children that when we were tested we refused to let this journey > end, that we did not turn back nor did we falter; and with eyes > fixed on the horizon and God's grace upon us, we carried forth that > great gift of freedom and delivered it safely to future > generations. Thank you. God bless you. And God bless the United > States of America.
From marcello at perathoner.de Fri Jan 23 23:13:40 2009 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat, 24 Jan 2009 08:13:40 +0100 Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: References: Message-ID: <497ABFA4.4080403@perathoner.de> Michael Hart wrote: > 2. "The Mall," in caps, is usualy something used by Washington > people more than the rest f the ountr for /that/ mall. http://en.wikipedia.org/wiki/National_Mall http://en.wikipedia.org/wiki/National_Mall#The_Presidential_Inauguration -- Marcello Perathoner, Cologne, Germany webmaster at gutenberg.org
From hart at pglaf.org Sat Jan 24 03:57:29 2009 From: hart at pglaf.org (Michael Hart) Date: Sat, 24 Jan 2009 03:57:29 -0800 (PST) Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: <497ABFA4.4080403@perathoner.de> References: <497ABFA4.4080403@perathoner.de> Message-ID: Sorry, "The National Mall" does not equal "this magnificent mall." I suppose you would write, "this magnificent House" in reference to "The White House," too, but I would only write it that way in reference to "The House Of Representatives." mh On Sat, 24 Jan 2009, Marcello Perathoner wrote: > Michael Hart wrote: > >> 2.
"The Mall," in caps, is usualy something used by Washington >> people more than the rest f the ountr for /that/ mall. > > http://en.wikipedia.org/wiki/National_Mall > > http://en.wikipedia.org/wiki/National_Mall#The_Presidential_Inauguration > > > -- > Marcello Perathoner, Cologne, Germany > webmaster at gutenberg.org > From hart at pglaf.org Sat Jan 24 04:31:44 2009 From: hart at pglaf.org (Michael Hart) Date: Sat, 24 Jan 2009 04:31:44 -0800 (PST) Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: <497ABFA4.4080403@perathoner.de> References: <497ABFA4.4080403@perathoner.de> Message-ID: Given that the official transcript had "Mall". . .it is rather amazing that half the publications I found of the speech had, "mall" in not caps, is certainly a sign, not only that The Mall is not always capitalized when referenced as "this. . .mall, but a very good sign that all these publications must have done something independent of just using the official speech as it was handed out from the official sources. I wonder how many people would capitalize "this Part" when the reference was to any of the U.S. National Park System parks, Yellowstone, Mount Rainier, or any others, including The Mall. I capitalize "The National Road" in reference to U.S. 40 parts, where that highway had that name, and perhaps beyond, but if we reference "this magnificent road" not too many would. # Full Transcript: President Obama's inauguration address ... Jan 24, 2009 ... and our creed ? why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... www.welt.de/english-news/article3062276/President-Obamas-inauguration-address.html - 132k - 12 hours ago - Cached - Similar pages # 2 million people crowd in to witness oath | The Journal Gazette Jan 20, 2009 ... creed," Obama said, "why men and women and children of every race and every faith can join in celebration across this magnificent mall. ... 
www.journalgazette.net/apps/pbcs.dll/article?AID=/20090120/NEWS03/901209912 - 46k - Cached - Similar pages # Khaki Wishes and Cookie Dreams ... of our liberty and our creed - why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... ardenashley.tumblr.com/ - 50k - Cached - Similar pages # Obama's speech- American majorities & minorities? - Yahoo! Answers "...men and women and children of every race and every faith can join in celebration across this magnificent mall." every race=everyone. 7 minutes ago ... answers.yahoo.com/question/index?qid=20090123184518AAsrHit - Similar pages # Salon.com News | "A new era of responsibility" Jan 20, 2009 ... "Why men and women and children of every race and every faith can join in celebration across this magnificent mall, and why a man whose ... www.salon.com/news/feature/2009/01/20/obama_inaugural/print.html - Similar pages # Many African-Americans felt they had to be there to believe it ... ... of our liberty and our creed -- why men and women and children of every race and every faith can join in celebration across this magnificent mall." ... www.buffalonews.com/246/story/555531.html - 43k - 10 hours ago - Cached - Similar pages # President Obama -- yep, it's real | The Loop Jan 22, 2009 ... every faith can join in celebration across this magnificent mall, and why a man whose father less than sixty years ago might not have. theloop21.com/blogs/president-obama-?-yep-its-real - 28k - Cached - Similar pages # President Obama asks nation to pull together | detnews.com | The ... Jan 21, 2009 ... "Men and women and children of every race and every faith can join in celebration across this magnificent mall," Obama said, adding that "a ... detnews.com/apps/pbcs.dll/article?AID=/20090121/POLITICS/901210381/1022/POLITICS - Similar pages # Mobile.Politico.com: His journey, and ours, begins Jan 23, 2009 ...
and our creed - - why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... mobile.politico.com/story.cfm?id=17685&cat=topnews - 14k - Cached - Similar pages # Barack Obama says nation has chosen 'hope over fear' | http, see ... Jan 20, 2009 ... our liberty and our creed - why men and women and children of every race and faith can join in celebration across this magnificent mall, ... www.ocregister.com/articles/http-see-inauguration-2285636-coverage-href - 85k - Cached - Similar pages # Obama Offers Internationalist Vision ... loyalty and patriotism" explained "why men and women and children of every race and every faith can join celebration across this magnificent mall, ... jonesreport.com/article/01_09/21intl.html - 15k - Cached - Similar pages # NHK ????? ?????????? ... why men and women and children of every race and every faith can join in celebration across this magnificent mall; and why a man whose father less than ... www.nhk.or.jp/toppage/obama2009/ - Similar pages # numérique : bonjour ... creed -- why men and women and children of every race and every faith can join in celebration across this magnificent mall, and why a man whose father, ... bonjour.blogs.courrierinternational.com/tag/num?rique - 38k - Cached - Similar pages # Living Martin Luther King Jr.'s dream on the National Mall ... Jan 22, 2009 ... and our creed -- why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... www.chicagotribune.com/news/nationworld/chi-civil-rights_wedjan21,0,6440273.story - 135k - Cached - Similar pages # A moment of utter integrity - The Irish Times - Wed, Jan 21, 2009 Jan 21, 2009 ... happiness" and about "why men and women and children of every race and every faith can join in celebration across this magnificent mall, ...
www.irishtimes.com/newspaper/world/2009/0121/1232474671073.html - 45k - Cached - Similar pages # The Wild Frontier at The Times » Blog Archive » Barack Obama ... Jan 20, 2009 ... and our creed - why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... blogs.thetimes.co.za/hartley/2009/01/20/barack-obama-inauguration-speech-key-extracts/ - 63k - Cached - Similar pages # Obama calls for 'era of responsibility': Top Stories | adn.com Jan 16, 2009 ... why men and women and children of every race and every faith can join in celebration across this magnificent mall, and why a man whose ... www.adn.com/626/story/660724.html - 63k - Cached - Similar pages # ABC Local - Transcript: Obama's inaugural address Jan 21, 2009 ... and our creed -- why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... www.abc.net.au/news/stories/2009/01/21/2470567.htm?site=local - 28k - Cached - Similar pages # Winds of Hope Are Blowing Across America « Lingua Franca Jan 20, 2009 ... liberty and our creed--why men and women and children of every race and every faith can join in celebration across this magnificent mall, ... epiac1216.wordpress.com/2009/01/20/winds-of-hope-are-blowing-over-america/ - 35k - Cached - Similar pages # The Pitt News - Obama swears in as 44th president "Why men and women and children of every race and every faith can join in celebration across this magnificent mall, and why a man whose father less than 60 ... www.pittnews.com/news/obama_swears_in_as_44th_president - 28k - Cached - Similar pages # Obama: 'We Have Chosen Hope' - CNN.com | República.net Jan 20, 2009 ... and women and children of every race and every faith can join in celebration across this magnificent mall, and why a man whose father, ...
www.republica.net/node/156 - 24k - Cached - Similar pages On Sat, 24 Jan 2009, Marcello Perathoner wrote: > Michael Hart wrote: > >> 2. "The Mall," in caps, is usualy something used by Washington >> people more than the rest f the ountr for /that/ mall. > > http://en.wikipedia.org/wiki/National_Mall > > http://en.wikipedia.org/wiki/National_Mall#The_Presidential_Inauguration > > > -- > Marcello Perathoner, Cologne, Germany > webmaster at gutenberg.org >
From hart at pglaf.org Sat Jan 24 05:00:23 2009 From: hart at pglaf.org (Michael Hart) Date: Sat, 24 Jan 2009 05:00:23 -0800 (PST) Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: References: <497ABFA4.4080403@perathoner.de> Message-ID: Before Marcello Perathoner comes back with his usual inundation of proof that he is, indeed, The King Of Trivial Pursuit, which I see as an acknowledged fact, and give him his due respect on that, I'm just going to say there are obviously going to be plenty of source material for both sides, and leave it at this: I am pleased that larger numbers of media outlets used independent sources for the speech -- far larger than I would have expected, and I salute them. Obviously not all of the independents said "mall" instead of "Mall" . . .it is but one indication. . . . . While freeing myself of the Trivial Pursuit aspects here, I have some additional information I came across about one of our previous hat tricks of the pursuit of perhaps slightly more importance. "We who are about to die salute you." In that conflagration, it was suggested that Julius Caesar was not the first of the caesars/Caesars, but one of a long line. While I disagreed, I let it drop without too much research, at the time of the discussion, but I came across this recently: "In the distribution of the days through the several months, Caesar adopted a simpler arrangement than that which we have now.
He had ordered that the first, third, fifth, seventh, ninth, and eleventh months, that is January, March, May, July, September and November, should each have thirty-one days, and the other months thirty, except February, which in common years should have only twenty-nine day, but every fourth year thirty days. This order was interrupted in 8 BC to gratify the vanity of Augustus, by giving the month bearing his name as many days as July, which had been re-named after the first Caesar during 44BC. A day was accordingly taken from ^^^^^^^^^^^^^^^^ February and given to August; and in order that three months of thirty-one days might not come together, September and November were reduced to thirty days, and thirty-one given to October and December."
From marcello at perathoner.de Sat Jan 24 10:43:59 2009 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat, 24 Jan 2009 19:43:59 +0100 Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: References: <497ABFA4.4080403@perathoner.de> Message-ID: <497B616F.3060409@perathoner.de> Michael Hart wrote: > > Given that the official transcript had "Mall". . .it is rather amazing > that half the publications I found of the speech had, > "mall" in not caps, is certainly a sign, not only that The Mall > is not always capitalized when referenced as "this. . .mall, > but a very good sign that all these publications must have done > something independent of just using the official speech as it > was handed out from the official sources. > > I wonder how many people would capitalize "this Part" when the > reference was to any of the U.S. National Park System parks, > Yellowstone, Mount Rainier, or any others, including The Mall. > > I capitalize "The National Road" in reference to U.S. 40 parts, > where that highway had that name, and perhaps beyond, but if we > reference "this magnificent road" not too many would. Busy day? > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d -- Marcello Perathoner, Cologne, Germany webmaster at gutenberg.org
From marcello at perathoner.de Sat Jan 24 11:23:34 2009 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat, 24 Jan 2009 20:23:34 +0100 Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: References: <497ABFA4.4080403@perathoner.de> Message-ID: <497B6AB6.10302@perathoner.de> Michael Hart wrote: > Sorry, "The National Mall" does not equal "this magnificent mall."
> > I suppose you would write, "this magnificent House" in reference > to "The White House," too, but I would only write it that way in > reference to "The House Of Representatives." I would write "this magnificent mall". You failed to notice that I didn't take offense at the capitalization (which is _correct_ in the transcript, and _wrong_ as given by Bowerbird) but to your statement regarding "the rest f the ountr [sic]". > > mh > > > On Sat, 24 Jan 2009, Marcello Perathoner wrote: > >> Michael Hart wrote: >> >>> 2. "The Mall," in caps, is usualy something used by Washington >>> people more than the rest f the ountr for /that/ mall. >> >> http://en.wikipedia.org/wiki/National_Mall >> >> http://en.wikipedia.org/wiki/National_Mall#The_Presidential_Inauguration >> >> >> -- >> Marcello Perathoner, Cologne, Germany >> webmaster at gutenberg.org >> > -- Marcello Perathoner, Cologne, Germany webmaster at gutenberg.org
From Bowerbird at aol.com Sat Jan 24 14:03:59 2009 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sat, 24 Jan 2009 17:03:59 EST Subject: [gutvol-d] obama inaugural speech transcript Message-ID:
michael-
you certainly summoned a lot of evidence! :+)
but i'm gonna stick with my version of "mall" capitalized, since the prepared version of the speech had it that way, and i presume that to be the intent of the author...
in most ways, i would defer to the _delivered_ version, since it is what was actually _said_ that matters more...
it's just that i cannot tell from his vocalization whether he "capitalized" the word "mall" when he spoke it... ;+)
likewise, as far as paragraph-breaks go, i used his, since -- you know -- i figure that he is the author...
but his linebreaks seemed arbitrary, so i inserted my own.
;+) i also decapitalized "scripture", but looked over my shoulder, just to make sure god wasn't throwing a thunderbolt at me... i saw your take on the "expedience" thing. i'll have to study it. according to the best i could glean from wikipedia, you're still mistaken on your spelling of "khe sanh". (strange reference, wasn't it?, since vietnam is now universally acknowledged as a "useless" war, but perhaps he wants to stress the point that we still have to "honor the troops" even when the war is pointless, so he doesn't have that problem as he withdraws us from iraq.) i noticed we disagreed on a few of the "inserted" words, including a "today" that i was unsure about, so i guess i have to listen to the speech while comparing it to my version directly, to know for sure. oh yeah, i liked what you did on "founding fathers" -- indicating that he paused for applause, and then repeated what he'd said -- since i assume that is what actually happened, even though the transcript i was using indicated that he'd done a word-belch... but unlike you, i won't repeat the phrase, because my outlook is that if the audience didn't hear it the first time he said it -- which is why he had to repeat it -- then it didn't really "count". otherwise, i'm surprised there were so few differences between the speech as written versus delivered. he is a gifted speaker, and had lots of practice in his years out on the campaign trail. it's like the teleprompter is his best friend... -bowerbird ************** A Good Credit Score is 700 or Above. See yours in just 2 easy steps! (http://pr.atwola.com/promoclk/100000075x1215855013x1201028747/aol?redir=http://www.freecreditreport.com/pm/default.aspx?sc=668072%26hmpgID=62%26bcd=De cemailfooterNO62) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hart at pglaf.org Sat Jan 24 14:41:05 2009 From: hart at pglaf.org (Michael Hart) Date: Sat, 24 Jan 2009 14:41:05 -0800 (PST) Subject: [gutvol-d] obama inaugural speech transcript In-Reply-To: References: Message-ID: On Sat, 24 Jan 2009, Bowerbird at aol.com wrote: > michael- > > you certainly summoned a lot of evidence! :+) That was just of a first page of the google search on the phrase. > but i'm gonna stick with my version of "mall" capitalized, > since the prepared version of the speech had it that way, > and i presume that to be the intent of the author... I serious doubt that HE was the single author. . . . Never actually considered it could have been otherise. However, I am proud to be a Washington "outsider." > in most ways, i would defer to the _delivered_ version, > since it is what was actually _said_ that matters more... > > it's just that i cannot tell from his vocalization whether > he "capitalized" the word "mall" when he spoke it... ;+) I listened to the entire speech over and over and over. I didn't hear any caps there. . . . > likewise, as far as paragraph-breaks go, i used his, > since -- you know -- i figure that he is the author... Again, live delivery is not always what was planned. Personally, I think mine has more impact, but. . . . > but his linebreaks seemed arbitrary, so i inserted my own. ;+) Of course, there ARE some whose line breaks are NOT arbitrary. > i also decapitalized "scripture", but looked over my shoulder, > just to make sure god wasn't throwing a thunderbolt at me... How do you do with Bible? Quran? Gita? I Ching? Mormon? etc? > i saw your take on the "expedience" thing. i'll have to study it. That was the hardest decision of all, I listened over and over, and tried to give it the best rendering possible. . . . I'm not actually sure that is how he meant to say it. . . . > according to the best i could glean from wikipedia, you're still > mistaken on your spelling of "khe sanh". 
> (strange reference,
> wasn't it?, since vietnam is now universally acknowledged as a
> "useless" war, but perhaps he wants to stress the point that we
> still have to "honor the troops" even when the war is pointless,
> so he doesn't have that problem as he withdraws us from iraq.)

Actually, I think my Khe Sanh spelling got inverted somehow,
as with scarcely, will make sure they are fixed in final copy.

> i noticed we disagreed on a few of the "inserted" words, including
> a "today" that i was unsure about, so i guess i have to listen to the
> speech while comparing it to my version directly, to know for sure.

He also reversed a couple words.

I will gladly relisten to any part you would like me to,
I have the tape still right in front of me.

> oh yeah, i liked what you did on "founding fathers" -- indicating
> that he paused for applause, and then repeated what he'd said --
> since i assume that is what actually happened, even though the
> transcript i was using indicated that he'd done a word-belch...
>
> but unlike you, i won't repeat the phrase, because my outlook
> is that if the audience didn't hear it the first time he said it --
> which is why he had to repeat it -- then it didn't really "count".

I think the audience heard it fine, but that he simply wanted
to get back on the rhythm of the phrasing.

> otherwise, i'm surprised there were so few differences between
> the speech as written versus delivered. he is a gifted speaker,
> and had lots of practice in his years out on the campaign trail.
> it's like the teleprompter is his best friend...

I think he intended to stick to the script, but that he is just
such a natural speaker that a few words came out his own way.

mh

>
> -bowerbird

From Bowerbird at aol.com Tue Jan 27 16:01:12 2009
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Tue, 27 Jan 2009 19:01:12 EST
Subject: [gutvol-d] those friendly people at distributed proofreaders
Message-ID:

those friendly people at distributed proofreaders
sometimes aren't quite so friendly after all...

for instance, consider this thread:
> http://www.pgdp.net/phpBB2/viewtopic.php?t=37215

a p1 proofer suggests that the p1 proofers should
stay away from projects with no "good words" list,
since "wordcheck" creates far too many false-alarms.

first notice that -- as a p1 proofer -- he isn't even
obligated, in any way at all, to even _use_ wordcheck.
but he still does, because he's a dedicated volunteer.

precisely because he does use wordcheck, he sees --
in p1, on projects with an inactive project manager --
that wordcheck causes far too many false-alarms...

and what happens?

the d.p. old-timers bust his ass. at least most of them.
a few _do_ come in, to say that he's got a good point.
but for the most part, they are highly critical of him...
because he had the balls to voice a valid criticism...

it's relevant to the series i've been running this month,
because if d.p. did a decent job of doing preprocessing,
they'd have _excellent_ good-word lists from the outset.

failing that, however, it's certainly the case that they can
re-program "wordcheck" to ignore any words that have
been suggested by a proofer when those words reappear
on another page _for_that_same_proofer_, as a gesture...

they could also re-program it so that if a word were to
be suggested, it automatically goes on the "good word"
list, subject then to the later review by the project manager...
or, if that's too generous for you, you could specify that
it'd be automatically added if _two_ proofers suggest it,
or _three_, or whatever number you believe to be fair...

the point is, when a word shows up dozens and dozens
of times, it's just unfair to dedicated volunteer proofers
to flag it every single time. and more than being unfair,
it's _unwise_, because it teaches them to ignore the flags.

-bowerbird

p.s. in fairness to juliet, she did write a comment saying
she hoped that "everyone will maintain our usual high
level of respect for each other in our public discourses",
which was an acknowledgment that some went too far...

From ralf at ark.in-berlin.de Thu Jan 29 03:32:34 2009
From: ralf at ark.in-berlin.de (Ralf Stephan)
Date: Thu, 29 Jan 2009 12:32:34 +0100
Subject: [gutvol-d] let's make a book-tree
Message-ID: <20090129113234.GA18988@ark.in-berlin.de>

In a molecular biology context, a researcher has developed
a new way to search/relate books. Can someone find where we
can download some code?

http://esciencenews.com/articles/2009/01/28/new.computational.technique.allows.comparison.whole.genomes.easily.whole.books

Excerpt:

In a test of free online books obtained through Project Gutenberg, they
found that this method, which they called the feature frequency profile
(FFP) method, was more successful at identifying related books - books
by the same author, books of the same genre, books from the same
historical era - than word frequency profile analysis.
In fact, a good
tree can be constructed by looking at a single "optimal" feature length,
such as nine letters, where the "vocabulary" is very large, instead of
looking at all possible lengths.

"I was just stunned when I saw this," Kim said. One of the reasons this
method works better, he said, may be that, while word frequency analysis
treats each word independently, feature frequency analysis picks up
syntax.

"Here, if I take a 9-letter window and slide it along the text," he
said, "I am actually picking up the relationship between the first and
second words - the local syntax - which was impossible to pick up from
the word frequency method. Apparently, that is very important."

Buoyed by this success, the researchers applied the technique to whole
genomes of mammals, where there is the least controversy in evolutionary
relationship. "We treat the genome like a book without spaces," Kim
said.


ralf

From schultzk at uni-trier.de Fri Jan 30 01:22:13 2009
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri, 30 Jan 2009 10:22:13 +0100
Subject: [gutvol-d] let's make a book-tree
In-Reply-To: <20090129113234.GA18988@ark.in-berlin.de>
References: <20090129113234.GA18988@ark.in-berlin.de>
Message-ID: <3A309ED9-6A77-4A85-B0D0-8511CDE8A86A@uni-trier.de>

Hi Ralf,

The method you describe is a simple form of so-called
Markov Chaining. Depending on the rest of the math done,
you not only get syntax information, but also semantic
information in the statistics. In other words, meaning.
So it is natural that the results are far better than
simple word statistics.

Another approach could be to do the work on the word
level instead of letters. Here you capture context directly.

A bit of warning though. The results are impressive in the
short term. But in real-world, large-scale use things quickly
become fuzzy and the results are not very helpful in the end:
because of the vast amount of data, the statistical analysis
gives a high rate of confidence, but far too many hits to be
practical.
Hope this helps

Keith.

On 29.01.2009 at 12:32, Ralf Stephan wrote:

> In a molecular biology context, a researcher has developed
> a new way to search/relate books. Can someone find where we
> can download some code?
>
> http://esciencenews.com/articles/2009/01/28/new.computational.technique.allows.comparison.whole.genomes.easily.whole.books
>
> Excerpt:
>
> In a test of free online books obtained through Project Gutenberg, they
> found that this method, which they called the feature frequency profile
> (FFP) method, was more successful at identifying related books - books
> by the same author, books of the same genre, books from the same
> historical era - than word frequency profile analysis. In fact, a good
> tree can be constructed by looking at a single "optimal" feature length,
> such as nine letters, where the "vocabulary" is very large, instead of
> looking at all possible lengths.
>
> "I was just stunned when I saw this," Kim said. One of the reasons this
> method works better, he said, may be that, while word frequency analysis
> treats each word independently, feature frequency analysis picks up
> syntax.
>
> "Here, if I take a 9-letter window and slide it along the text," he
> said, "I am actually picking up the relationship between the first and
> second words - the local syntax - which was impossible to pick up from
> the word frequency method. Apparently, that is very important."
>
> Buoyed by this success, the researchers applied the technique to whole
> genomes of mammals, where there is the least controversy in evolutionary
> relationship. "We treat the genome like a book without spaces," Kim
> said.
>
>
> ralf
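
For anyone who wants to experiment before the researchers' code turns up, the 9-letter sliding window from the excerpt can be sketched in a few lines of Python. This is only an illustration, not the published FFP implementation: the function names are made up here, dropping everything except letters is an assumption (borrowed from the "book without spaces" remark), and the excerpt does not say how two profiles are compared, so Jensen-Shannon divergence is used as one plausible choice.

```python
import math
import re
from collections import Counter


def feature_profile(text, length=9):
    """Slide a window of `length` letters along the text and return
    the relative frequency of every window (feature) that occurs."""
    # Assumption: keep letters only, lowercased, so the text is
    # treated like "a book without spaces".
    letters = re.sub(r"[^a-z]", "", text.lower())
    counts = Counter(letters[i:i + length]
                     for i in range(len(letters) - length + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two profiles:
    0.0 for identical profiles, 1.0 when no feature is shared.
    Assumption: the excerpt does not name a distance measure."""
    def kl_to_mean(a, b):
        # Kullback-Leibler divergence from `a` to the mean of a and b.
        return sum(pa * math.log2(pa / (0.5 * (pa + b.get(f, 0.0))))
                   for f, pa in a.items())
    return 0.5 * kl_to_mean(p, q) + 0.5 * kl_to_mean(q, p)
```

To compare two books, build a profile for each and pass both to `js_divergence`; a table of such pairwise distances is the kind of input a tree-building program would need. Keith's word-level variant would only require splitting the text into words and sliding the window over that list instead of over letters.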