From jon.ingram at gmail.com Sun Oct 1 03:40:51 2006 From: jon.ingram at gmail.com (Jon Ingram) Date: Sun Oct 1 03:41:03 2006 Subject: [gutvol-d] PG Examples of XHTML and CSS? In-Reply-To: <000001c6e507$f9bb6ee0$1f12fea9@sarek> References: <000001c6e507$f9bb6ee0$1f12fea9@sarek> Message-ID: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> On 10/1/06, John Hagerson wrote: > If I remember correctly, someone was creating PG texts using CSS and XHTML, > but I don't remember who it was. I would like to see an example that uses > these technologies. The W3.org website has all of the information, but > sometimes it's like trying to find a needle in a haystack to find the answer > to a specific question. > > If someone could provide the name of the poster or an e-book number, that > would be very helpful. Thank you. Many of the books processed by the DP site in the last few years have had an XHTML version created. We even have very rough guidelines for the marking up of things like poetry and page numbers, although there's a lot of variation between individual projects. 'Uberprojects' like periodicals often have a style-guide which is followed by almost all the posted issues. You could take a look at individual issues to see which styles you like (or dislike). Here's a random Punch issue: http://www.gutenberg.org/etext/17397 And a random Scientific American issue: http://www.gutenberg.org/etext/11649 Everyone will have their favourite example of HTML/XHTML texts on PG. Personally I've been very impressed with some of the work that people have done on books I've scanned (which for some reason means that my name goes on the PG 'Produced by' line before them, which isn't a particularly fair reflection on the amount of work put in). 
Take a look for example at Tintinnalogia, or, the Art of Ringing, by Fabian Stedman http://www.gutenberg.org/etext/18567 Amusements in Mathematics, by Henry Dudeney http://www.gutenberg.org/etext/16713 The Tatler, Volume 1, by Richard Steele et al., ed. George Aitken http://www.gutenberg.org/etext/13645 If you give more information about what particularly you're looking for, I might be able to be a bit more selective rather than throwing out random links to books I like! -- Jon Ingram From sly at victoria.tc.ca Sun Oct 1 09:23:31 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Sun Oct 1 09:23:36 2006 Subject: [gutvol-d] PG Examples of XHTML and CSS? In-Reply-To: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> References: <000001c6e507$f9bb6ee0$1f12fea9@sarek> <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> Message-ID: Also, you might want to check out the experiences of DP volunteers in preparing HTML. I believe the general consensus has been that there is enough variation in needs between different projects that trying to define one strict standard does not work. But some general guidelines have emerged. Start at the page: http://www.pgdp.net/wiki/HTML That includes a link to a "CSS cookbook" that you might find to be of interest. Andrew From jon at noring.name Sun Oct 1 11:59:10 2006 From: jon at noring.name (Jon Noring) Date: Sun Oct 1 11:59:22 2006 Subject: [gutvol-d] Alternate CSS style sheets for "My Antonia" requested (was: PG Examples of XHTML and CSS?) In-Reply-To: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> References: <000001c6e507$f9bb6ee0$1f12fea9@sarek> <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> Message-ID: <1108209902.20061001125910@noring.name> John Hagerson wrote: > If I remember correctly, someone was creating PG texts using CSS and XHTML, > but I don't remember who it was.
I would like to see an example that uses > these technologies. The W3.org website has all of the information, but > sometimes it's like trying to find a needle in a haystack to find the answer > to a specific question. > > If someone could provide the name of the poster or an e-book number, that > would be very helpful. Thank you. I've placed online the book "My Antonia" by Willa Cather. It is valid to XHTML 1.1 with three different CSS style sheet options (and a version with no style sheet applied -- only browser defaults used): http://www.openreader.org/myantonia Jose Menendez has an HTML 4.01 version of the same book (and my version essentially relied on his for final proofing to catch the last remaining transcription errors) with an internal CSS style sheet: http://www.ibiblio.org/ebooks/Cather/ (I'm not sure, but Jose may have donated his text version to PG. Our versions differ in that mine is faithful to the original 1918 edition including some text errors found in the original printing -- Jose's is a corrected "reader" edition. But my edition does include markup to flag the text errors and provide what the correct text should be per Jose's corrections, plus a few listed at the UNL Cather site.) ***** Now, my text layout skills are downright pitiful, and anyone wishing to submit alternative CSS style sheets for my version of "My Antonia" is welcome to do so -- the more the merrier (every person submitting CSS will be acknowledged.) I believe the markup has sufficient structural and semantic granularity to do some pretty advanced CSS presentation. Jon Noring From j.hagerson at comcast.net Sun Oct 1 13:01:23 2006 From: j.hagerson at comcast.net (John Hagerson) Date: Sun Oct 1 13:01:34 2006 Subject: [gutvol-d] PG Examples of XHTML and CSS? In-Reply-To: Message-ID: <000001c6e594$5f012970$1f12fea9@sarek> Thank you very much Jon and Andrew. 
Between the samples listed, the cookbook, and the other resources noted on the PG wiki, I think I will be able to mark up the text I'm working on. I need to think more of semantic tags rather than presentation tags. There is a gestalt to this that I haven't quite mastered. From jon at noring.name Sun Oct 1 13:49:59 2006 From: jon at noring.name (Jon Noring) Date: Sun Oct 1 13:50:12 2006 Subject: [gutvol-d] PG Examples of XHTML and CSS? In-Reply-To: <000001c6e594$5f012970$1f12fea9@sarek> References: <000001c6e594$5f012970$1f12fea9@sarek> Message-ID: <1511133066.20061001144959@noring.name> John Hagerson wrote: > Thank you very much Jon and Andrew. Between the samples listed, the > cookbook, and the other resources noted on the PG wiki, I think I will be > able to mark up the text I'm working on. I need to think more of semantic > tags rather than presentation tags. There is a gestalt to this that I > haven't quite mastered. Glad to have been of help. My call for alternate style sheets for my version of "My Antonia" is possible only because the markup is strictly structural/semantic. Had I done old-fashioned HTML markup (where I mix in presentational tags along with the structural/semantic tags), it is no longer possible to have the flexibility of presentation. (It's also important NOT to use tables for layout purposes.) An interesting site which demonstrates the full power of CSS and the separation of presentation from structure is the CSS Zen Garden site: http://www.csszengarden.com/ where the same XHTML 1.0 Strict document (well essentially the same with respect to structural/semantic markup) is presented in hundreds of different ways solely by swapping the CSS style sheet (background images are also customized and applied using CSS). It's amazing what can be done with CSS applied to purely structural/semantic markup. Another important aspect of having structural/semantic-only markup is accessibility. 
Such documents have a high degree of accessibility (again, it is important NOT to use table markup for layout purposes if one wants maximal accessibility -- CSS Zen Garden shows that tables are not necessary for complex layouts.) A while back I did some XHTML markup on the "We Media" document for JD Lasica and the OurMedia project. I asked the CSS authoring community for alternate CSS style sheets for that document. Two people supplied CSS: http://www.openreader.org/wemedia/ (I like Bob's a little better. Note how readable the document is even without CSS, which is accomplished by proper XHTML markup.) Jon Noring From Bowerbird at aol.com Sun Oct 1 14:51:46 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sun Oct 1 14:51:53 2006 Subject: [gutvol-d] a 6-year-old Message-ID: <591.2094c8ac.32519272@aol.com> distributed proofreaders is 6 years old today. happy birthday! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061001/af4db866/attachment.html From schultzk at uni-trier.de Sun Oct 1 23:54:56 2006 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Sun Oct 1 23:55:02 2006 Subject: [gutvol-d] oh geez In-Reply-To: <517.7d6a5e4.324ebb23@aol.com> References: <517.7d6a5e4.324ebb23@aol.com> Message-ID: <6EF33CA4-2593-4C3C-912E-C83AC1CBD081@uni-trier.de> Hi Bowerbird, Thanks for the refresher course, but that was not my point. I AGREE with you fully mark-up is a pain in the old behind. Keith. On 29.09.2006 at 20:08, Bowerbird@aol.com wrote: > keith said: > > I doubt that very much!! > > Mark-up is a necessity of > > language and communication > > whether you see it or not. > > zen markup language > _is_ a form of "markup", > but it's the "light" kind -- > not that _heavy_ stuff -- > so it doesn't take much > time or money or energy > to "apply" it where needed. > > as to "whether you see it or not", > z.m.l. generally tries to be invisible.
> [snip, snip rest deleted for brevity] From Bowerbird at aol.com Mon Oct 2 00:26:50 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 2 00:26:59 2006 Subject: [gutvol-d] oh geez Message-ID: <51f.7c17ac2.3252193a@aol.com> keith said: > I AGREE with you fully mark-up is a pain in the old behind. that's why -- when they realize markup is _also_ unnecessary -- people will leave it behind, immediately, like a bad housemate, and be relieved to be done with it, and swear never to go back... :+) -bowerbird From hyphen at hyphenologist.co.uk Mon Oct 2 02:21:40 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Mon Oct 2 02:21:53 2006 Subject: [gutvol-d] oh geez In-Reply-To: <51f.7c17ac2.3252193a@aol.com> References: <51f.7c17ac2.3252193a@aol.com> Message-ID: On Mon, 2 Oct 2006 03:26:50 EDT, Bowerbird@aol.com wrote: |keith said: |> I AGREE with you fully mark-up is a pain in the old behind. | |that's why -- when they realize markup is _also_ unnecessary -- |people will leave it behind, immediately, like a bad housemate, |and be relieved to be done with it, and swear never to go back... :+) I left Mark Up behind way back in 1985. For HTML I rely on NVU, which is WYSIWYG. -- Dave Fawthrop From nwolcott2ster at gmail.com Mon Oct 2 08:01:35 2006 From: nwolcott2ster at gmail.com (Norm Wolcott) Date: Mon Oct 2 08:09:06 2006 Subject: [gutvol-d] PG Examples of XHTML and CSS? 
References: <000001c6e507$f9bb6ee0$1f12fea9@sarek><4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com> Message-ID: <003c01c6e633$c07d5e40$640fa8c0@atlanticbb.net> If you care to go through the pain of installing Guiguts (http://www.pgdp.net/wiki/PPTools/Guiguts), there is an option to create a CSS/XHTML web page from a text version made to PG standards. The HTML/CSS is quite generic and can be tweaked for individual needs. Since DP may take a couple of years to process a book now, doing your own thing may again be an option. nwolcott2@post.harvard.edu _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d From nwolcott2ster at gmail.com Mon Oct 2 08:13:41 2006 From: nwolcott2ster at gmail.com (Norm Wolcott) Date: Mon Oct 2 08:27:56 2006 Subject: [gutvol-d] Scraping text from Univ Mich googles Message-ID: <008701c6e637$463b02a0$640fa8c0@atlanticbb.net> Is there any way, other than a page by page scraping of the html from the text images provided by UMich for their google books, to get the whole text in one file, or thereabouts? The other question is: does re-OCR'ing the page images give any better results than starting with the page texts given by google? Have any of the other google participants seen fit to provide the google text? nwolcott2@post.harvard.edu From hart at pglaf.org Mon Oct 2 09:47:06 2006 From: hart at pglaf.org (Michael Hart) Date: Mon Oct 2 09:47:07 2006 Subject: [gutvol-d] oh geez, part 2 In-Reply-To: References: Message-ID: Once you find yourself sucked down into the mud, you'll find that they enjoy it, are practiced at it, and that the only way to beat them is to become one of them, which is total defeat. On Sat, 30 Sep 2006 Bowerbird@aol.com wrote: > jon said: >> Am I the only one to see who >> is really slinging the mud here? > > i did not resort to anything ad hominem. > > i said unflattering things, yep i sure did!, > but if any of them do not jibe with reality, > then by all means feel free to express that. > if i agree that i overstepped appropriateness, > i will be more than happy to issue an apology. > > while i talk about the issues, > david attacks me _personally_, > (and strays from the truth too), > instead of addressing my points. > that's my definition of mudslinging. 
> > anyway, i rarely mention david at all, > and probably shouldn't have gone on > after that initial post, but somebody > _did_ ask. (and i think that i was fair > by advising him to read david's blog > and make up his own mind about it.) > > -bowerbird > From Bowerbird at aol.com Mon Oct 2 09:55:02 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 2 09:55:08 2006 Subject: [gutvol-d] re: Scraping text from Univ Mich googles Message-ID: norm said: > Is there any way, other than a page by page scraping of the html > from the text images provided by UMich for their google books-- > to get the whole text in one file, or thereabouts. i've written a scraper-program, yeah. i'm reluctant to release it to the public, because in the hands of the wrong person, it could really piss off umichigan, to the point of making them reconsider their decision to release the o.c.r. text. but i'd be happy to send it to _you_, norm, and anyone who has proven by their actions that they're dedicated to the cause of e-books, and willing to do the work of digitizing books... for those who are interested in scraping text from umichigan, you might wanna read a series of posts i've been making to the bookpeople listserve, on digitizing umichigan o.c.r. text: > http://onlinebooks.library.upenn.edu/webbin/bparchive search for "feedback to umichigan" to find the series... > The other question is does re-OCR'ing the page images > give any better results than starting with the page texts > given by google? it just might. i haven't done any kind of inventory yet, but the o.c.r. text for the one book i'm doing for that series of posts is badly flawed. it's missing much info, including paragraphing, text-styling, text-indentation, and even the hyphens on the end-of-line hyphenates, so it's been an unnecessarily hard job to babysit the text. 
nonetheless, i'm still on-track to digitize the entire book in just one hour, and i've documented each task meticulously, so you can make up your own mind on how you'd proceed. > Have any of the other google participants > seen fit to provide the google text? not yet, as far as i know, but i hope they all will eventually. as you'll see from the umichigan text, however, we cannot count on the interface to be even minimally desirable, so it's gonna be necessary to scrape that content so that we can provide it to people in a form with acceptable usability. > Since DP may take a couple of years to process a book now, > doing your own thing may again be an option. i think the choice of one hour of work versus months of waiting for a book to come out of d.p. is a choice with stark perspective. -bowerbird From hart at pglaf.org Mon Oct 2 09:59:34 2006 From: hart at pglaf.org (Michael Hart) Date: Mon Oct 2 09:59:36 2006 Subject: [gutvol-d] oh geez, part 2 In-Reply-To: References: Message-ID: Well, since I have some friends who already have one, and assure me that it is so much of a dog that you should be prepared with flea remedies, I would have to go along with David Rothman in this instance, though I usually take the same precautions with him. Michael On Fri, 29 Sep 2006 Bowerbird@aol.com wrote: > david rothman is advising people > not to get caught up in the hype > over the "forthcoming" sony reader. > > your lesson in irony is over for today. > > -bowerbird > From gbnewby at pglaf.org Mon Oct 2 15:19:16 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Oct 2 15:19:18 2006 Subject: [gutvol-d] Make a video, get a thumper Message-ID: <20061002221916.GA20599@pglaf.org> The new Sun x4500 server (previously known as the "thumper") is a 24TB file server (twenty-four terabytes). 
They have a new offer out where those who make a video can win one. This would be ideal for PG to do a massive public collection of collections, metadata, etc. I can think of several ideas along that theme.... but there's close to zero chance I can make a video anytime soon. If you might be interested, take a look: http://sunflash.sun.com/articles/103/4/promos/17052 -- Greg From traverso at dm.unipi.it Tue Oct 3 02:33:56 2006 From: traverso at dm.unipi.it (Carlo Traverso) Date: Tue Oct 3 02:54:54 2006 Subject: [gutvol-d] Make a video, get a thumper In-Reply-To: <20061002221916.GA20599@pglaf.org> (message from Greg Newby on Mon, 2 Oct 2006 15:19:16 -0700) References: <20061002221916.GA20599@pglaf.org> Message-ID: <200610030933.k939XudT028369@posso9.dm.unipi.it> >>>>> "Greg" == Greg Newby writes: Greg> The new Sun x4500 server (previously known as the "thumper") Greg> is a 24TB file server (twenty-four terabytes). They have a Greg> new offer out where those who make a video can win one. Greg> This would be ideal for PG to do a massive public collection Greg> of collections, metadata, etc. I can think of several ideas Greg> along that theme.... but there's close to zero chance I can Greg> make a video anytime soon. Greg> If you might be interested, take a look: Greg> http://sunflash.sun.com/articles/103/4/promos/17052 Interesting, especially the 10-rack with 240TB for $470,995.00 The entry level, 12TB for $32,995.00 might be affordable for PG, with a fund-raising campaign supported by a clear use project. Sun itself might contribute with a substantial discount. Carlo From Bowerbird at aol.com Fri Oct 6 14:58:31 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 6 14:58:44 2006 Subject: [gutvol-d] re: Scraping text from Univ Mich googles Message-ID: <489.57ec8658.32582b87@aol.com> just wanted to give y'all an update on this... it ends up the o.c.r. text from the university of michigan is quite worthless, so bad there's no use in even scraping it... 
almost all of it is lacking quote-marks and em-dashes and the hyphens from end-of-line hyphenates, and paragraphs and text-styling and text-indentation too, so it's more work in most books to restore all that than to do the o.c.r. yourself. in fact, it'd probably be better to type a book from scratch than try to deal with this ugly o.c.r., because at least with a type-in, you can actually follow the narrative of the book. so i guess you'd have to say the umichigan o.c.r. is actually _worse_ than worthless. me, i'd be _embarrassed_ to post it in a public place, let alone offer it to a university community. but hey, maybe that's just _me_, know what i mean? so -- at least at this point in time -- michael hart was right that the google project wouldn't give us good digital text... of course, i was _also_ right, when i said that we should be willing to create good digital text ourselves, from the scans. and that still holds true... but -- at least so far -- i was wrong when i predicted that we would be given highly-proofed text from the project... so there you have it, michael. i was wrong. you were right. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061006/1b7980b8/attachment.html From Bowerbird at aol.com Sat Oct 7 11:43:37 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 11:43:48 2006 Subject: [gutvol-d] worse than worthless Message-ID: ok, when i said the umichigan o.c.r. is "worse than worthless", that was a rather unflattering description, wasn't it? yes it was. but judge for yourself whether it's fair, with these 2 verne works: > http://snowy.arsc.alaska.edu/bowerbird/misc/eighty.txt > http://snowy.arsc.alaska.edu/bowerbird/misc/china.txt the lost hyphenation and paragraphing can be restored automatically, in most cases, so doesn't have to entail _that_ much work (but some)... the lost quote-marks, however, are a _ton_ of work to reinstate. 
likewise the em-dashes (although there usually aren't too many) and text-styling and formatting (which vary from book-to-book). i'd think it's nearly impossible to write routines to automate all that. (i am not even slightly inclined to take it on as a difficult challenge.) moreover, since if you do the o.c.r. _correctly_, you can avoid all this unnecessary work, and since batch o.c.r. only takes a few minutes to set up, there's _no_ reason to waste your time with umichigan o.c.r. your o.c.r. program will pay for itself in no time, and you will be _considerably_ less frustrated, which is worth a lot all by itself... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/7a9dfc8c/attachment.html From Bowerbird at aol.com Sat Oct 7 12:18:37 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 12:18:43 2006 Subject: [gutvol-d] re: worse than worthless Message-ID: <54b.849850a.3259578d@aol.com> i forgot to mention that, in addition to all the missing data, the _recognition_ itself on the "80 days" book is atrocious... if you want a good laugh about how bad o.c.r. can get, that's one example that will give it to you... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/3044582c/attachment.html From Bowerbird at aol.com Sat Oct 7 12:50:10 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 12:50:17 2006 Subject: [gutvol-d] more amusement Message-ID: so i went to google to get the scans for the "chinaman" book by jules verne, and was amused to discover their pagenumbers are off by 2, which means that this page right here... > http://books.google.com/books?vid=LCCN01009859&id=-82QXfOrkwAC&pg=PP11& dq=%22The+tribulations+of+a+Chinaman+in+China%22&as_brr=1 has all the links that were meant for this contents page... 
> http://books.google.com/books?vid=LCCN01009859&id=-82QXfOrkwAC&pg=PP9& dq=%22The+tribulations+of+a+Chinaman+in+China%22&as_brr=1 and that, when you search for terms, the yellow highlighting is on the wrong page. this is the kind of comical b.s. you get when your filenames don't include pagenumbers at their core, and you end up tripping all over your "metadata pointers"... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/2551e3b5/attachment.html From Bowerbird at aol.com Sat Oct 7 14:31:06 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 14:31:14 2006 Subject: [gutvol-d] how do i get to this url? Message-ID: <54b.84ab814.3259769a@aol.com> why, when i asked for this url: > http://www.gutenberg.org/files/17903/17903-h/17903-h.htm am i directed to this page? > http://www.gutenberg.org/etext/17903 even when i ask for the above link from the overview page, i am recycled back to the overview page. i've noticed this same type of bug on other e-texts as well... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/82195bba/attachment.html From Bowerbird at aol.com Sat Oct 7 14:44:49 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 14:45:03 2006 Subject: [gutvol-d] how do i get to this url? Message-ID: i said: > why, when i asked for this url: i forgot to say the bug doesn't happen in all of my browsers, just my main one (camino, the newest version, under o.s.x.)... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/cf525030/attachment.html From nwolcott2ster at gmail.com Sat Oct 7 16:23:26 2006 From: nwolcott2ster at gmail.com (Norm Wolcott) Date: Sat Oct 7 16:24:37 2006 Subject: [gutvol-d] worse than worthless References: Message-ID: <007201c6ea67$9d2c6020$640fa8c0@atlanticbb.net> I would say that the China text is considerably better than the 80Days text. It apparently will vary from book to book. The 80 days book had a very narrow printed page, and so many hyphens which were lost. China does not seem to have many hyphenated lines. In both cases it is necessary to have the book available for good scans for final editing. I also may have lost something in the conversion from utf to iso without doing any converting. Also there seems to be much less conversation in this book, making restoring quote marks less of a challenge. nwolcott2@post.harvard.edu ----- Original Message ----- From: Bowerbird@aol.com To: gutvol-d@lists.pglaf.org ; Bowerbird@aol.com Sent: Saturday, October 07, 2006 2:43 PM Subject: [gutvol-d] worse than worthless ok, when i said the umichigan o.c.r. is "worse than worthless", that was a rather unflattering description, wasn't it? yes it was. but judge for yourself whether it's fair, with these 2 verne works: > http://snowy.arsc.alaska.edu/bowerbird/misc/eighty.txt > http://snowy.arsc.alaska.edu/bowerbird/misc/china.txt the lost hyphenation and paragraphing can be restored automatically, in most cases, so doesn't have to entail _that_ much work (but some)... the lost quote-marks, however, are a _ton_ of work to reinstate. likewise the em-dashes (although there usually aren't too many) and text-styling and formatting (which vary from book-to-book). i'd think it's nearly impossible to write routines to automate all that. (i am not even slightly inclined to take it on as a difficult challenge.) moreover, since if you do the o.c.r. 
_correctly_, you can avoid all this unnecessary work, and since batch o.c.r. only takes a few minutes to set up, there's _no_ reason to waste your time with umichigan o.c.r. your o.c.r. program will pay for itself in no time, and you will be _considerably_ less frustrated, which is worth a lot all by itself... -bowerbird ------------------------------------------------------------------------------ _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061007/86873699/attachment.html From Bowerbird at aol.com Sat Oct 7 18:27:53 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 7 18:28:02 2006 Subject: [gutvol-d] worse than worthless Message-ID: <389.cafd767.3259ae19@aol.com> norm said: > China does not seem to have many hyphenated lines. In both cases > it is necessary to have the book available for good scans for final editing. having spent time working with the o.c.r. from umichigan, i can assure you it will be faster for you to re-do the o.c.r. if you are bound and determined to use their text, though, i can send you a program that will automatically repair most of the end-line hyphenates, and restore much paragraphing. for the rest, though, you're pretty much on your own, sadly... > Also there seems to be much less conversation in this book, > making restoring quote marks less of a challenge. it might seem that way, but i predict you will find your error-rate is quite high, unacceptably high, not something that you would want to put your name on, not if you have any sense of pride... even if this is a rare edition, you won't get much honor by issuing a flawed digitization out into the world... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
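The automatic repair of end-of-line hyphenates mentioned in the message above is not shown anywhere in the thread, so what follows is only a minimal sketch of one way such a repair might be approached, not bowerbird's actual program. It assumes the o.c.r. text has kept the original line breaks but dropped the hyphens, and that a word list is available. The function name `rejoin_hyphenates` and the tiny `lexicon` are hypothetical. The heuristic is deliberately conservative: two fragments are joined only when the joined form is a known word and the fragments are not both known words themselves.

```python
import re


def rejoin_hyphenates(lines, lexicon):
    """Rejoin words split across line ends where the hyphen was lost.

    lines   -- list of text lines, in reading order
    lexicon -- set of lowercase known words (an assumed external word list)
    """
    out = [line.split() for line in lines]
    for i in range(len(out) - 1):
        if not out[i] or not out[i + 1]:
            continue  # blank line: nothing to join across
        head, tail = out[i][-1], out[i + 1][0]
        # Look up letters only, so a fragment like "ple," still matches.
        tail_core = re.sub(r"[^A-Za-z]+$", "", tail)
        joined = (head + tail_core).lower()
        both_are_words = (
            head.lower() in lexicon and tail_core.lower() in lexicon
        )
        if tail_core and joined in lexicon and not both_are_words:
            out[i][-1] = head + tail  # trailing punctuation is kept
            del out[i + 1][0]         # fragment now lives on the line above
    return [" ".join(tokens) for tokens in out]


# Hypothetical sample: "jour" + "ney" is repaired, "with" + "out" is left
# alone because both halves are ordinary words and need a human decision.
sample = ["Phileas Fogg began his jour", "ney around the world with", "out delay."]
lexicon = {"phileas", "fogg", "began", "his", "journey", "around", "the",
           "world", "with", "out", "without", "delay"}
repaired = rejoin_hyphenates(sample, lexicon)
```

A check like this cannot decide ambiguous cases such as "with" followed by "out", and it does nothing for lost quote-marks or em-dashes, which is consistent with the point made in the thread that those still need the page scans.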
From hyphen at hyphenologist.co.uk Sat Oct 7 23:45:03 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sat Oct 7 23:45:17 2006
Subject: [gutvol-d] New Web site problem
Message-ID: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>

The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost the link to the Project Gutenberg Upload Pages http://upload.pglaf.org. I searched long and hard but failed to find it :-(. May be there somewhere but I was forced back onto my copyright clearance email, to get there.

The site may now be Wiki, but if everyone put links where they wanted, the whole site would rapidly become a mess. Perhaps someone who understands the layout of the new site could add it.

--
Dave Fawthrop

From marcello at perathoner.de Sun Oct 8 08:22:15 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Oct 8 08:22:20 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
Message-ID: <452917A7.60006@perathoner.de>

Dave Fawthrop wrote:

> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> the link to the Project Gutenberg Upload Pages http://upload.pglaf.org. I
> searched long and hard but failed to find it :-(. May be there somewhere
> but I was forced back onto my copyright clearance email, to get there.
>
> The site may now be Wiki, but if everyone put links where they wanted, the
> whole site would rapidly become a mess. Perhaps someone who understands
> the layout of the new site could add it.

If you really mean wikipedia, maybe you should contact them :-)

There never was a link to upload.pglaf.org on the main page.

Try this page:

http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook

--
Marcello Perathoner
webmaster@gutenberg.org

From Catenacci at Ieee.Org Sun Oct 8 11:18:41 2006
From: Catenacci at Ieee.Org (Onorio Catenacci)
Date: Sun Oct 8 11:18:44 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <452917A7.60006@perathoner.de>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com> <452917A7.60006@perathoner.de>
Message-ID:

On 10/8/06, Marcello Perathoner wrote:
> Dave Fawthrop wrote:
>
> > The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> > the link to the Project Gutenberg Upload Pages http://upload.pglaf.org. I
> > searched long and hard but failed to find it :-(. May be there somewhere
> > but I was forced back onto my copyright clearance email, to get there.
> >
> > The site may now be Wiki, but if everyone put links where they wanted, the
> > whole site would rapidly become a mess. Perhaps someone who understands
> > the layout of the new site could add it.
>
> If you really mean wikipedia, maybe you should contact them :-)
>
> There never was a link to upload.pglaf.org on the main page.
>
> Try this page:
>
> http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook
>

I was wondering why he brought up Wikipedia. :-)

--
Onorio

From hyphen at hyphenologist.co.uk Sun Oct 8 12:50:26 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sun Oct 8 12:50:39 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <452917A7.60006@perathoner.de>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com> <452917A7.60006@perathoner.de>
Message-ID:

On Sun, 08 Oct 2006 17:22:15 +0200, Marcello Perathoner wrote:

|Dave Fawthrop wrote:
|
|> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
|> the link to the Project Gutenberg Upload Pages http://upload.pglaf.org. I
|> searched long and hard but failed to find it :-(. May be there somewhere
|> but I was forced back onto my copyright clearance email, to get there.
|>
|> The site may now be Wiki, but if everyone put links where they wanted, the
|> whole site would rapidly become a mess. Perhaps someone who understands
|> the layout of the new site could add it.
|
|If you really mean wikipedia, maybe you should contact them :-)

Oops copied the wrong URL

|There never was a link to upload.pglaf.org on the main page.
|
|Try this page:
|
|http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook

Yes but how do you get there from http://www.gutenberg.org/wiki/Main_Page

Got it right this time ;-)

Followed everything from
http://www.gutenberg.org/wiki/Category:Volunteering
and there is nothing there.

--
Dave Fawthrop

From marcello at perathoner.de Sun Oct 8 13:48:07 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Oct 8 13:48:12 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To:
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com> <452917A7.60006@perathoner.de>
Message-ID: <45296407.1010502@perathoner.de>

Dave Fawthrop wrote:

> On Sun, 08 Oct 2006 17:22:15 +0200, Marcello Perathoner
> wrote:
>
> |Dave Fawthrop wrote:
> |
> |> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> |> the link to the Project Gutenberg Upload Pages http://upload.pglaf.org. I
> |> searched long and hard but failed to find it :-(. May be there somewhere
> |> but I was forced back onto my copyright clearance email, to get there.
> |>
> |> The site may now be Wiki, but if everyone put links where they wanted, the
> |> whole site would rapidly become a mess. Perhaps someone who understands
> |> the layout of the new site could add it.
> |
> |If you really mean wikipedia, maybe you should contact them :-)
>
> Oops copied the wrong URL
>
> |There never was a link to upload.pglaf.org on the main page.
> |
> |Try this page:
> |
> |http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook
>
> Yes but how do you get there from http://www.gutenberg.org/wiki/Main_Page
>
> Got it right this time ;-)
>
> Followed everything from
> http://www.gutenberg.org/wiki/Category:Volunteering
> and there is nothing there.

Either use your browser's search function to search for "submit" on the main page, or enter "submit" into the "search site" box on the main page and click on "Search Site", or go to the How-To Category:

http://www.gutenberg.org/wiki/Category:How-To

--
Marcello Perathoner
webmaster@gutenberg.org

From schultzk at uni-trier.de Mon Oct 9 00:31:44 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Mon Oct 9 00:31:50 2006
Subject: [gutvol-d] how do i get to this url?
In-Reply-To:
References:
Message-ID: <676E337B-66AA-4D6F-A821-578397E97585@uni-trier.de>

Hi,

I just tried loading the page and it came up o.k. Kind of made the program sluggish till it completely loaded.

Using: PowerBook G4 (1.5 GB, 1.3 GHz), Mac OSX 10.4.8 and Camino Version 2006091101 (1.0.3Int).

regards
Keith.

On 07.10.2006 at 23:44, Bowerbird@aol.com wrote:

> i said:
> > why, when i asked for this url:
>
> i forgot to say the bug doesn't happen in all of my browsers,
> just my main one (camino, the newest version, under o.s.x.)...
>
> -bowerbird
From Bowerbird at aol.com Tue Oct 10 09:57:34 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 10 09:57:54 2006
Subject: [gutvol-d] credit-lines and generosity
Message-ID: <537.8b7861a.325d2afe@aol.com>

oh geez, some small-minded critics are giving josh grief on the "posted" listserve because he put his name on the "credits" line for the work he did on preparing and uploading some audio files to the library.

i guess they think that work just happens magically.

even if whitewashers normally work without credit, what does credit hurt? i strongly believe they deserve it, big-time.

not that the credit-lines are all that vital -- i routinely strip them off the e-texts -- but my goodness, why be so _stingy_?

-bowerbird

From joshua at hutchinson.net Tue Oct 10 11:31:55 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Tue Oct 10 11:54:24 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
Message-ID: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>

Howdy all!

As I start to post audiobooks to the archive, I've started getting constructive criticism sent my way (thanks to those that have, btw!).

One item is brought up over and over. Is there any way to have the download links say which chapter the link is pointed to (or some other human identifiable information)?

Here is an example of The Marvelous Land of Oz (http://www.gutenberg.org/etext/19466):

Apple iTunes Audiobook    none    738 KB     main site    mirror sites    P2P
Apple iTunes Audiobook    none    1.98 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.20 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.13 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    1.63 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.65 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.20 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.63 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.64 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.68 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.49 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.11 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.22 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.74 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.41 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.53 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.33 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.27 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    4.69 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.77 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.95 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.36 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    1.80 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    3.18 MB    main site    mirror sites    P2P
Apple iTunes Audiobook    none    2.26 MB    main site    mirror sites    P2P

That is the preface through chapter 24, but nothing on that page indicates which is which.

Now, I realize that the bibrec page was not designed with audio books in mind, so this is in no way meant as an attack on anyone's efforts. Rather, this is a question on what can we/I do to make those pages more accessible to the end user.

The text file contains a listing of the chapters, and I could create an HTML catalog with actual links (which I may do moving forward), but it doesn't help the layout of the current bibrec page which is ... well, a bit daunting to look at.

Hmm, one idea just came to me, so shoot me down if this is stupid. What if the bibrec page did NOT show any of the audio files directly, but rather just the link to an HTML document. Then, when they click that, each chapter would be clearly labelled and linked to. Is this stupid? Is this doable?

Josh

From marcello at perathoner.de Tue Oct 10 13:58:30 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Oct 10 13:58:33 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
In-Reply-To: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>
References: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>
Message-ID: <452C0976.10909@perathoner.de>

mailbox@hutchinson.net wrote:

> Is there any way to have the download links say which chapter the link
> is pointed to (or some other human identifiable information)?

Not with the current software.

Before implementing file comments (for chapter headings etc.) on the bibrec page we need some standard way to pass them on to the automatic cataloger. That means: posting some sort of RDF file along with the files.

> Now, I realize that the bibrec page was not designed with audio books
> in mind, so this is in no way meant as an attack on anyone's
> efforts. Rather, this is a question on what can we/I do to make those
> pages more accessible to the end user.

If you look at this page

http://www.gutenberg.org/etext/9551

you'll see that the catalog groks two special file types: "readme" and "index" and sorts them to the top of the list. You can build a nicely formatted "index" file and post it along with your sound files.
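
A minimal sketch of how such an "index" file might be generated from a chapter list. Everything specific here is an assumption for illustration: the function name `build_index`, the file names, and the plain ordered-list layout are hypothetical, and nothing in this thread says what the catalog requires of an index file beyond its file type.

```python
from html import escape


def build_index(book_title, chapters):
    """Return a small HTML page with one labelled link per audio file.

    `chapters` is a list of (label, href) pairs in reading order.
    """
    items = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(escape(href, quote=True),
                                                escape(label))
        for label, href in chapters
    )
    title = escape(book_title)
    return (
        "<html>\n"
        "<head><title>" + title + "</title></head>\n"
        "<body>\n"
        "<h1>" + title + "</h1>\n"
        "<ol>\n" + items + "\n</ol>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical file names; the real ones depend on how the files are posted.
    chapters = [
        ("Preface", "chapter-00.m4b"),
        ("Chapter 1", "chapter-01.m4b"),
        ("Chapter 2", "chapter-02.m4b"),
    ]
    print(build_index("The Marvelous Land of Oz", chapters))
```

Relative links keep the page usable on mirrors as well as on the main site, provided the index sits in the same directory as the sound files.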

> The text file contains a listing of the chapters, and I could create
> an HTML catalog with actual links (which I may do moving forward),
> but it doesn't help the layout of the current bibrec page which is
> ... well, a bit daunting to look at.
>
> Hmm, one idea just came to me, so shoot me down if this is stupid.
> What if the bibrec page did NOT show any of the audio files directly,
> but rather just the link to an HTML document. Then, when they click
> that, each chapter would be clearly labelled and linked to. Is this
> stupid? Is this doable?

We can treat sound files the same way as image files which also don't show up. In this case, if you don't post an index file, the sound files will be accessible only through the apache directory listings.

Another problem is that in the past we didn't post text and audio versions under the same etext number. Thus an etext no. can be declared AudioBook or not but not both.

Also, people have asked for ways to download all sound files in one swoop. Maybe we should post a standard playlist format, so people can use their xampp / winampp to listen to the files?

--
Marcello Perathoner
webmaster@gutenberg.org

From joshua at hutchinson.net Tue Oct 10 18:49:23 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Tue Oct 10 18:49:38 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
Message-ID: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>

>----Original Message----
>From: marcello@perathoner.de
>
>If you look at this page
>
> http://www.gutenberg.org/etext/9551
>
>you'll see that the catalog groks two special file types: "readme" and
>"index" and sorts them to the top of the list. You can build a nicely
>formatted "index" file and post it along with your sound files.
>

Excellent. That looks good.

Related question: How did the encoding field get populated for that one? Did a cataloger do that by hand? Is there something I should/can do for new audio postings?
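
On the playlist suggestion quoted above, a minimal sketch of what a generated playlist could look like, assuming the widely used extended M3U layout (a `#EXTM3U` header, then an `#EXTINF:<seconds>,<title>` line before each entry). The function name `build_m3u`, the URLs, and the durations are hypothetical; -1 is the conventional value when a duration is unknown.

```python
def build_m3u(tracks):
    """Return extended-M3U playlist text.

    `tracks` is a list of (title, duration_seconds, url) tuples in
    playing order; use -1 for an unknown duration.
    """
    lines = ["#EXTM3U"]
    for title, seconds, url in tracks:
        lines.append("#EXTINF:{},{}".format(seconds, title))
        lines.append(url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical locations; real URLs depend on where the files are posted.
    tracks = [
        ("Preface", -1, "http://example.org/audio/chapter-00.mp3"),
        ("Chapter 1", -1, "http://example.org/audio/chapter-01.mp3"),
    ]
    print(build_m3u(tracks), end="")
```

A single playlist file of this kind would also give listeners one link to follow instead of twenty-five.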

>
>Also, people have asked for ways to download all sound files in one
>swoop. Maybe we should post a standard playlist format, so people can
>use their xampp / winampp to listen to the files?
>

Ah, yes, that is a good idea. Librivox uses the m3u playlist format for streaming... That'd probably be well appreciated. I'll see what I can put together on the next posting.

Josh

From marcello at perathoner.de Wed Oct 11 03:18:31 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 11 03:18:36 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
In-Reply-To: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>
References: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>
Message-ID: <452CC4F7.8070600@perathoner.de>

mailbox@hutchinson.net wrote:

> Related question: How did the encoding field get populated for that
> one? Did a cataloger do that by hand? Is there something I should/can
> do for new audio postings?

By hand. You don't need that unless you post the same file type with different bitrates.

--
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com Wed Oct 11 10:51:36 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 11 10:51:46 2006
Subject: [gutvol-d] talking turkey
Message-ID: <404.6b4518bf.325e8928@aol.com>

well, it's 6 weeks until thanksgiving.

last year, i told david rothman that i'd buy him a tofu turkey if the people doing the "hundred dollar laptop" actually had a machine for sale to us "ordinary folks" for $200 or less by this thanksgiving, a rumor that david reported around that time...

in this blog entry:
> http://www.teleread.org/blog/?p=3911

david had said, about the $200 quote:
> The price figure is just speculation,
> but it seems realistic to me.

of course, what "seems realistic" to _david_ often seems to be totally unrealistic to me.

in addition to the $200 laptop prediction, david said that "eventually" we would have a $50 computer.

i pointed out, in a comment, that a computer that cost $100 to build would end up costing about $400 at retail, and david advised folks to "tune in a year from now, and we'll see who's right".

that's when i told him i'd buy him a tofu turkey if his prediction held. i also told him that, if there was a real computer available for $50 within the next _five_ years, i would buy him one, which seems a safe bet, since in another 5 years, _lunch_ will cost $50, and i'd be happy to buy david lunch some time.

well, sure enough, over the past year, the "hundred dollar laptop" has been rechristened (a number of times) to take the focus off the _price_ (which -- even in volumes of millions of units -- doesn't seem to be quite obtainable), and rumors of retail sales to americans have been re-floated, but this time with a pricetag around $450 (with a tax-break write-off for your "donation" to charity).

so much for whose prediction was right.

someday the one-laptop-per-child project _will_ create a very cheap laptop, perhaps even one that can sell (in mass) for $100, and thus in units of 1 for as little as $200, but that day won't come in the next 6 weeks.

so david, i guess you had better plan on buying your own tofu turkey for thanksgiving this year.

so why am i telling gutvol-d all this?

because i've been advising rothman all along to focus on the _reality_ instead of all the _hype_, so people start realizing e-books are here now, instead of around a corner that never gets turned, with one of the most solid e-book realities being the long and proud history of project gutenberg, which is now approaching its 20,000th e-text...

so i'll be giving thanks this year for the volunteers -- from distributed proofreaders and elsewhere -- who have made this great cyberlibrary possible...

-bowerbird
From hart at pglaf.org Wed Oct 11 14:06:28 2006
From: hart at pglaf.org (Michael Hart)
Date: Wed Oct 11 14:06:30 2006
Subject: [gutvol-d] talking turkey $100 Laptops
In-Reply-To: <404.6b4518bf.325e8928@aol.com>
References: <404.6b4518bf.325e8928@aol.com>
Message-ID:

From what I understand, Libya has ordered a $100 laptop for each of its schoolchildren. . . .

I'll try to forward the reference.

mh

From hart at pglaf.org Wed Oct 11 14:07:04 2006
From: hart at pglaf.org (Michael Hart)
Date: Wed Oct 11 14:07:06 2006
Subject: [gutvol-d] $100 Laptop... (fwd)
Message-ID:

---------- Forwarded message ----------
Date: Wed, 11 Oct 2006 12:32:39 -0700 (PDT)
Subject: $100 Laptop...

Libya has just ordered $100 laptops for their 1.2 million school children.

http://www.agoravox.com/article.php3?id_article=5235&id_forum=3267&var_mode=recalcul#forum

This should put the production line for these bad boys on a very firm footing. Great stuff!

From joey at joeysmith.com Wed Oct 11 14:16:08 2006
From: joey at joeysmith.com (joey)
Date: Wed Oct 11 14:34:48 2006
Subject: [gutvol-d] talking turkey $100 Laptops
In-Reply-To:
References: <404.6b4518bf.325e8928@aol.com>
Message-ID: <20061011211608.GA29634@joeysmith.com>

On Wed, Oct 11, 2006 at 02:06:28PM -0700, Michael Hart wrote:
>
> From what I understand, Libya has ordered a $100 laptop for each
> of its schoolchildren. . . .
>
> I'll try to forward the reference.
>
> mh

You're probably looking for http://www.nytimes.com/2006/10/11/world/africa/11laptop.html

From Bowerbird at aol.com Wed Oct 11 15:06:16 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 11 15:06:24 2006
Subject: [gutvol-d] $100 Laptop... (fwd)
Message-ID:

michael said:
> This should put the production line for these bad boys
> on a very firm footing. Great stuff!

yes, it was this news that reminded me of the prediction that rothman had made last year; that's why i checked it...

however, i think o.l.p.c. is waiting until they have orders for 5 million before they're proceeding. (i think they already had a few million ordered, so maybe this libyan order will get them over that hump.)

and yes, i think it's _great_ that nations are now putting in their orders.

i would think that -- given the _half-trillion_dollars_ that have evaporated in iraq -- it would be cost-effective for the united states to donate 5-10 million machines to various countries around the world, to get this project going, and polish our tarnished reputation.

but, as you know, i'm not running the country. "the decider" is...

-bowerbird

From Bowerbird at aol.com Fri Oct 13 14:40:31 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 13 14:40:40 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <417.ce757e8.326161cf@aol.com>

remember years back, on this very listserve, when we had a long-running series of threads on my "zen markup language", where detractors said it couldn't possibly provide sufficient detail to delineate all of the structures found in books, and i replied by making a list of those structures and formulating a test document containing them?

> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml

and remember how nobody ever came up with a structure that i had missed, indicating my list was sufficiently strong, inclusive, and exhaustive?

well, jon noring -- better late than never -- is now running through the same exercise on his listserve.

why?
because he has this "idea" about creating an authoring-tool that would spit out various types of e-book formats, like .html and .lit and so on.

and then david rothman writes up a teleblog entry about this "cool idea by jon noring" as if _nobody_ in the world had ever had it before, let alone already created such an authoring tool.

and then robert nagel (of idiotprogrammer.com) comments that "wow, if only such a tool existed!"

and the mutual hype society completes another cycle.

i mean, i'm really _glad_ that jon has come to admit the importance of an authoring-tool. i've been telling him that he needed to recognize that for _years_ now.

but hey, we're already in year 2006, soon to be 2007. if you're just now catching up to the "cool idea" of an _authoring-tool_, then you will need to speed up the process, especially if you are an "expert" in e-books, as jon is very fond of having himself described.

so maybe someone should tell jon that i already have the authoring-tool thing covered, and he can move immediately to the next stage of his development, ok?

-bowerbird

From jon at noring.name Fri Oct 13 16:23:39 2006
From: jon at noring.name (Jon Noring)
Date: Fri Oct 13 16:23:48 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <417.ce757e8.326161cf@aol.com>
References: <417.ce757e8.326161cf@aol.com>
Message-ID: <1191732505.20061013172339@noring.name>

Bowerbird wrote:

> well, jon noring -- better late than never -- is now
> running through the same exercise on his listserve.
>
> why? because he has this "idea" about creating an
> authoring-tool that would spit out various types of
> e-book formats, like .html and .lit and so on.

*laugh!!!*

I knew you were going to bring this up here.

The simple answer is that I've been talking about this since The eBook Community started in 1996.
The Yahoo archive dates from mid-1999, and just do a search there, and you will find my comments. Try "authoring tool" for one search term, but this won't catch everything I've talked about on this topic.

So your claim that I've only recently "seen the light" is simply historical revisionism. It's wonderful to have searchable archives. And the full ebook-list/TeBC archive will hopefully soon be on Google.

> and then david rothman writes up a teleblog entry
> about this "cool idea by jon noring" as if _nobody_
> in the world had ever had it before, let alone already
> created such an authoring tool.

Nope, no one has issued a fairly simple to use ebook authoring tool which exports into all the common ebook formats today, and into any ebook format envisioned for the future.

And this includes you, Bowerbird -- you are not there yet. Call or email me when you are finished and ready to market it to small publishers. Of course, they'll ask for high-quality PDF output (where they control all formatting and the quality of typesetting is at least as good as Word's), LIT, Mobipocket, RTF, OEBPS, OpenReader, the various flavors of Palm formats (can't keep them straight), etc. Funny thing, though, publishers will not ask for plain text.

> i mean, i'm really _glad_ that jon has come to admit
> the importance of an authoring-tool. i've been telling
> him that he needed to recognize that for _years_ now.

Wow! I'd never have known.

> so maybe someone should tell jon that i already have
> the authoring-tool thing covered, and he can move
> immediately to the next stage of his development, ok?

I look forward to your producing a publisher-ready version of your tool which will export into all the common ebook formats of today and the foreseeable future. The world needs one! Here's your chance for glory (which I know you are not seeking, humble you.)

Btw, do you plan to open source your tool? And if not, why not?

Jon Noring

(p.s., Bowerbird, have you approached small ebook publishers with your tool to see if they will embrace it?)

From lee at novomail.net Fri Oct 13 14:48:49 2006
From: lee at novomail.net (Lee Passey)
Date: Fri Oct 13 16:27:23 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <417.ce757e8.326161cf@aol.com>
References: <417.ce757e8.326161cf@aol.com>
Message-ID: <453009C1.202@novomail.net>

Bowerbird@aol.com wrote:

[snip]

> so maybe someone should tell jon that i already have
> the authoring-tool thing covered, and he can move
> immediately to the next stage of his development, ok?
>
> -bowerbird

"Said the pieman to Simple Simon, 'Show me first your penny.'"

--
Nothing of significance below this line.

From Bowerbird at aol.com Fri Oct 13 17:12:21 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 13 17:12:28 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <57c.6c45766.32618565@aol.com>

if anyone wants a plain-text to (x)html authoring tool, i suggest they check out "markdown" for starters...

> http://daringfireball.net/projects/markdown/

you can play around with it using its "dingus":
> http://daringfireball.net/projects/markdown/dingus

my own interest is in _disintermediating_ publishers, not marketing my programs to them.

have a nice weekend...

-bowerbird

From jon at noring.name Fri Oct 13 17:19:10 2006
From: jon at noring.name (Jon Noring)
Date: Fri Oct 13 17:25:51 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <57c.6c45766.32618565@aol.com>
References: <57c.6c45766.32618565@aol.com>
Message-ID: <7810100940.20061013181910@noring.name>

Bowerbird wrote:

> if anyone wants a plain-text to (x)html authoring tool,
> i suggest they check out "markdown" for starters...
>
>> http://daringfireball.net/projects/markdown/
>
> you can play around with it using its "dingus":
>> http://daringfireball.net/projects/markdown/dingus

Excellent reference. I've been monitoring the 'markdown' mailing list for some time now. Interesting discussions there...

> my own interest is in _disintermediating_ publishers,
> not marketing my programs to them.

Well, now that we got your position on the matter...

Jon

From Bowerbird at aol.com Sat Oct 14 00:38:39 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct 14 00:38:56 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID:

i said:
> my own interest is in _disintermediating_ publishers

which, by the way, is why i remain staunchly anti-d.r.m.

some "experts" want to get "buy-in" from the publishers, so they are willing to do something as stupid as putting _locks_ on books (so as to turn 'em into cash-registers).

what a supremely stupid attitude.

one beauty of electronic-books and cyberspace is that we can _free_ ourselves from the shackles placed on us by the greedy-rich-boy middlemen who now siphon off a boatload of the cash between artists and the audience, and desperately wish to maintain their "business model".

why in the world you'd want to get "buy-in" from these thieves is totally beyond me.

and that's _my_ position...

-bowerbird

From Bowerbird at aol.com Sat Oct 14 15:56:01 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct 14 15:56:11 2006
Subject: [gutvol-d] re: oh geez, part 3
Message-ID: <457.6871ecdd.3262c501@aol.com>

it occurs to me that there is a need to state the relevance to gutvol-d...

we want to be able to create books in the formats people want them...

as most of you already know well, david moynihan over at blackmask -- whom i _support_ in his lawsuit, unlike many fair-weather friends -- managed to convert the plain-text e-texts from project gutenberg to a wide variety of e-book formats, and he did it _automatically_ using scripts that he developed himself...

in contrast, the xslt workflow that many posit as the mechanism that turns x.m.l. files into various formats still hasn't been developed or proven.

in a nutshell, conversions are not hard, not for me. might be hard for others, but they're not hard for me.

that is all...

-bowerbird

From Bowerbird at aol.com Sun Oct 15 14:21:39 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Oct 15 14:21:50 2006
Subject: [gutvol-d] online editing of documents, collaboratively
Message-ID: <46c.2324ee45.32640063@aol.com>

of course, the future of online documents is already here, what with the arrival of web-apps to do word-processing, which allows long-distance collaboration between people.

for instance, that test-suite that i pointed to just recently?
> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml

here it is in an incarnation using google's online tool:
> http://docs.google.com/View?docid=dgczchnc_0d9c9b6

to experiment, i re-did all the styling, links, etc., from scratch, which was quite unnecessarily painful for me, since i am now accustomed to the automatic formatting done by my tools, but i'm confident that google will eventually catch up to me... :+)

but notice that you can upload a document in various formats, and i would assume that styling, links, etc., would be retained...

conversely, also notice that once a document is up, google handles conversions to other formats, i.e., .html, .pdf, .rtf, .doc, and ascii...

i'm guessing that amazon's version of this tool will include a routine that also converts into mobipocket format, wouldn't you think? :+)

anyway, this is how digitization efforts should be done in 2007, not the clunky markup-based way that distributed proofreaders is settling on for its workflow.

-bowerbird

p.s. thanks to the spellcheck feature of google's online tool, i found a typo in my file. darn, it's totally amazing how those things creep in!

p.p.s. if you're allergic to google, this tool is getting good reviews:
> http://www.zohowriter.com

From hyphen at hyphenologist.co.uk Sun Oct 15 23:58:17 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Mon Oct 16 00:05:08 2006
Subject: [gutvol-d] online editing of documents, collaboratively
In-Reply-To: <46c.2324ee45.32640063@aol.com>
References: <46c.2324ee45.32640063@aol.com>
Message-ID:

On Sun, 15 Oct 2006 17:21:39 EDT, Bowerbird@aol.com wrote:

I had a quick look at:
|> http://docs.google.com/View?docid=dgczchnc_0d9c9b6

and found
>> chapter 13 -- unlucky 13
>> there is no 13th floor in most buildings.

Not true over most of the world. Just a *silly* USian cultural Oddity.

This suggests that Chapter 13 in PG Books be renamed somehow, which is *bad*.
-- Dave Fawthrop From traverso at dm.unipi.it Mon Oct 16 00:31:22 2006 From: traverso at dm.unipi.it (Carlo Traverso) Date: Mon Oct 16 00:29:36 2006 Subject: [gutvol-d] online editing of documents, collaboratively In-Reply-To: (message from Dave Fawthrop on Mon, 16 Oct 2006 07:58:17 +0100) References: <46c.2324ee45.32640063@aol.com> Message-ID: <200610160731.k9G7VMD14905@pico.dm.unipi.it> >>>>> "Dave" == Dave Fawthrop writes: Dave> On Sun, 15 Oct 2006 17:21:39 EDT, Bowerbird@aol.com wrote: Dave> I had a quick look at: Dave> |> http://docs.google.com/View?docid=dgczchnc_0d9c9b6 Dave> and found Dave> >> chapter 13 -- unlucky 13 Dave> >> there is no 13th floor in most buildings. Dave> Not true over most of the world. Just a *silly* USian cultural Oddity. I disagree. In most of the world there is no 13th floor in most buildings: most buildings have less than 13 floors. (and if they have, there is a 13th floor, even if you name it differently, or if you don't name it at all) Carlo From joey at joeysmith.com Mon Oct 16 02:25:03 2006 From: joey at joeysmith.com (joey) Date: Mon Oct 16 02:31:27 2006 Subject: [gutvol-d] re: oh geez, part 3 In-Reply-To: <457.6871ecdd.3262c501@aol.com> References: <457.6871ecdd.3262c501@aol.com> Message-ID: <20061016092503.GB29634@joeysmith.com> On Sat, Oct 14, 2006 at 06:56:01PM -0400, Bowerbird@aol.com wrote: > in contrast, the xslt workflow that > many posit as the mechanism that > turns x.m.l. files into various formats > still hasn't been developed or proven. Simply not true -- not for publishing in general, and not for PG-related projects specifically. I use XSLT to publish to HTML, PDF, plain text, and OASIS Open Document Format [among others] from DocBook and similar formats on a daily basis. Additionally, I have previously shown XSLT stylesheets on gutvol-p that took some XML provided by Greg (a bunch of Dickens works) which output HTML and plain text.
I stopped short of the PDF at the time because I was not the only person working on the project, and it seemed to me that some of the others were further along than I. Additionally, I take exception to your assertion earlier on this list: > and remember how nobody ever came up with a > structure that i had missed, indicating my list was > sufficiently strong, inclusive, and exhaustive? I chose not to indulge your mania further. That does not mean I never came up with a structure that you had missed. In fact, I found it trivial to come up with such. Indeed, this is a classic logical fallacy, known commonly as "Argument from ignorance"...that is, "a premise is true only because it has not been proven false". All of that aside, please stop trying to import arguments from other fora into this one. If I wanted to know the latest on what Jon or David or the Teleblog community had to say on matters, I would seek it from them - not indirectly from a known detractor of their cause via the PG mailing lists. You have your own blog, please use that to stump for yourself. From prosfilaes at gmail.com Mon Oct 16 04:13:07 2006 From: prosfilaes at gmail.com (David Starner) Date: Mon Oct 16 04:13:11 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: <417.ce757e8.326161cf@aol.com> References: <417.ce757e8.326161cf@aol.com> Message-ID: <6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com> On 10/13/06, Bowerbird@aol.com wrote: > and remember how nobody ever came up with a > structure that i had missed, indicating my list was > sufficiently strong, inclusive, and exhaustive? I'm curious whether I had you in my kill-file by then (which I still do, but gmail makes it too easy to drag out killed messages), or if this is just a deeply skewed memory.
I suspect the latter, since your test document says "there aren't a whole lot of tables in the e-texts -- we're talking literature, not spreadsheets -- but your system should handle tables anyway; not really big and hairy ones, just simple ones", and http://www.pgdp.net/phpBB2/viewtopic.php?t=4311 shows some really big and hairy tables found in real PG etexts. The test document certainly shows no evidence of the arbitrary evilness the Early English Text Society and friends saw fit to hand us; heck, it doesn't even show how to handle sidenotes, those things ubiquitous in pre-18th century printing. Not to mention math. Why am I even bothering to try and prove that statement comes out of your own little world? From bill at williamtozier.com Mon Oct 16 05:14:07 2006 From: bill at williamtozier.com (William Tozier) Date: Mon Oct 16 05:21:00 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: <6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com> References: <417.ce757e8.326161cf@aol.com> <6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com> Message-ID: <45930876-D41E-400A-AEB1-A87F57FF613E@williamtozier.com> On Oct 16, 2006, at 7:13 AM, David Starner wrote: > Why am I even > bothering to try and prove that statement comes out of your own little > world? We all lapse, now and then. You beat me to it this time by a matter of minutes. Your backslide keeps me from jumping in as well. :) Altruism. ----- Bill Tozier AIM: vaguery@mac.com blog: http://williamtozier.com/slurry plazes: http://beta.plazes.com/user/BillTozier skype: vaguery "Nature, however picturesque, never yet made a poet of a dullard." 
--Hjalmar Hjorth Boyesen From Bowerbird at aol.com Mon Oct 16 09:21:46 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 09:22:05 2006 Subject: [gutvol-d] re: oh geez, part 3 Message-ID: joey said: > I have previously shown XSLT stylesheets on gutvol-p that > took some XML provided by Greg (a bunch of Dickens works) > which output HTML and plain text. let's see that work joey. > I chose not to indulge your mania further. That does not mean > I never came up with a structure that you had missed. let's hear them, joey. > please stop trying to import arguments from other fora i've stated the relevance each time. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/3d7d868e/attachment.html From Bowerbird at aol.com Mon Oct 16 10:04:15 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 10:05:08 2006 Subject: [gutvol-d] online editing of documents, collaboratively Message-ID: <595.444e5f55.3265158f@aol.com> dave said: > Just a *silly* USian cultural Oddity. yeah right. i was thinking we'd made it through the recent friday-the-13th when -- rumble! -- earthquake rocks hawaii. unrelated, you say? sure it's "unrelated". > This suggests that Chapter 13 in PG Books > be renamed somehow, which is *bad*. ok, i withdraw that "suggestion"... ;+) *** carlo said: > In most of the word there is no 13th floor in most buildings: > most buildings have less than 13 floors. bingo! > (and if they have, there is a 13th floor, even if you > name it differently, or if you don't name it at all) semantics! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/7936fd96/attachment.html From Bowerbird at aol.com Mon Oct 16 10:12:34 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 10:12:38 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: david said: > your test document says "there aren't a whole lot of tables in the e-texts > -- we're talking literature, not spreadsheets -- but your system should > handle tables anyway; not really big and hairy ones, just simple ones", > and http://www.pgdp.net/phpBB2/viewtopic.php?t=4311 > shows some really big and hairy tables found in real PG etexts. surely you don't think that pulling one e-text out of the whole library -- or even 100 of them! -- negates that general statement, do you? even _600_ exceptions would be just 3% of the ~20,000 e-texts now. but let's get down to brass tacks, shall we? in saying that "your system should be able to handle simple tables", i'm just laying out a minimum requirement that would suffice here, for the other people that might want to develop their system for p.g. my own system will eventually be able to handle quite complex tables, when i find the need to develop it that far. and if you'd like some proof, then hand me a list of 100 e-texts that use tables, and i will tackle them first when the time for "attacking tables" comes up big on my agenda... (and leave out the spalding baseball guides, i already know about them.) in the meantime, if you think "tables" is something that you can point to as a "shortcoming" in my list, then you really need to rethink. i am quite well aware of tables, and even included them in my test-suite, thank you. > The test document certainly shows no evidence of the arbitrary evilness > the Early English Text Society and friends saw fit to hand us; arbitrariness? if i want to point to _arbitrariness_, i will point to the inconsistencies in the production of the e-texts themselves, which are riddled with inconsistencies. 
i don't need to point to work from artisans of the previous century, or the century before, work that was done _manually_, for the most part, and not aided by computers that should help make things much more uniform. you want "arbitrary" today? take a good look at the .html versions that have come out of distributed proofreaders over the last 6 years. it's just a shame that all of the hard work that went into making 'em is going to have to be tossed out, regretfully, when future digitizers conclude that it's simply _easier_ to re-do the work -- from scratch -- than to try to puzzle out the unique make-up of each of those files... (and yes, david, i do know that you have been one of the voices over on the d.p. boards in favor of greater standardization of the .html, so i salute you for taking that stand there; you are on the right side.) > heck, it doesn't even show how to handle sidenotes, > those things ubiquitous in pre-18th century printing. some sidenotes are essentially headings, and thus should be treated in that manner. others are annotations, and should be treated as such. it is only your carelessness that now lumps both of these cases together. really, david, if you want to "prove" something or other, you're going to have to work much harder than this. do you really think that i've thought about it and examined literally thousands of books, and not encountered some "sidenotes" on one occasion or another? do you really believe that i haven't rolled my eyes time after time when sidenotes were "discussed" on the d.p. boards? (ditto for "small caps" markup in the last 6 months; markup has done to you guys what it always does to people, which is to get them embroiled in minutia such that they badly lose the big picture.) > Not to mention math. oh david, you've mentioned "math" many times. over and over again, david. and over and over again, i've replied that i'll handle equations as graphics. 
eventually, if there is a compelling need, i might even adapt one of the existing plain-text solutions for rendering graphics (tex, anyone?) to do the job. but i doubt there will ever be such a "compelling need" to pull math equations out of books that are, in most cases, 80+ years old. (and david, usually you mention "music" along with math. how come you didn't do that this time? perhaps you've forgotten the drill, man.) > Why am I even bothering to try and prove that statement > comes out of your own little world? yes, david, why are you even bothering to _try_ and do something that you are so clearly incapable of doing? something that you have _never_ been able to do before? the only way you're going to "defeat" me is to put me back into your "kill" file. stick your head in the ground, ostrich... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/eb15df6d/attachment-0001.html From Bowerbird at aol.com Mon Oct 16 11:36:17 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 11:36:23 2006 Subject: [gutvol-d] any more questions, lee? Message-ID: <3b3.9849ea8.32652b21@aol.com> lee said: > "Said the pieman to Simple Simon, > 'Show me first your penny.'" so lee, do you have any more questions about simple-simon authoring-tools? if you do, i'll be happy to answer 'em... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/79794354/attachment.html From cannona at fireantproductions.com Mon Oct 16 11:37:18 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Mon Oct 16 11:37:39 2006 Subject: [gutvol-d] oh geez, part 3 References: Message-ID: <001d01c6f152$279e35d0$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bowerbird wrote: > oh david, you've mentioned "math" many times. over and over again, > david. 
> > and over and over again, i've replied that i'll handle equations as > graphics. Doesn't sound terribly accessible. > > eventually, if there is a compelling need, i might even adapt one of the > existing plain-text solutions for rendering graphics (tex, anyone?) to > do the job. but i doubt there will ever be such a "compelling need" to > pull math equations out of books that are, in most cases, 80+ years old. For that matter, why would we even care about crappy old books in the first place. Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFM9FyI7J99hVZuJcRAtBqAKDwQRMmqqOeCtyYa3S0VK/f18AkNwCgrkzP /sqIqFvhWXkoQjC7UESuvAM= =ro11 -----END PGP SIGNATURE----- From prosfilaes at gmail.com Mon Oct 16 12:04:09 2006 From: prosfilaes at gmail.com (David Starner) Date: Mon Oct 16 12:04:12 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: References: Message-ID: <6d99d1fd0610161204q100ba7cfj7559963b4b7b38f6@mail.gmail.com> On 10/16/06, Bowerbird@aol.com wrote: > even _600_ exceptions would be just 3% of the ~20,000 e-texts now. You claimed that your list was "strong, inclusive, and exhaustive", but only handle 97% of the texts? There's very few fields where a 97% success rate is considered good enough, outside head-to-head competitions; usually, if a 97% success rate is tolerated, there's research to make it better, but it's a hard enough problem that 97% is the best possible now. That's not the case here. > my own system will eventually be able to handle quite complex tables, > when i find the need to develop it that far. So your system doesn't currently handle everything necessary. > > The test document certainly shows no evidence of the arbitrary evilness > > the Early English Text Society and friends saw fit to hand us; > > arbitrariness? 
So you avoid the point; that your test-suite doesn't consist of real life problems and isn't nearly as painful as the real life problems I see. > > heck, it doesn't even show how to handle sidenotes, > > those things ubiquitous in pre-18th century printing. > > some sidenotes are essentially headings, and thus should be treated > in that manner. others are annotations, and should be treated as such. > it is only your carelessness that now lumps both of these cases together. That's an editor's job. I'm not an editor; I merely want to reproduce the text as is. > > Not to mention math. > > oh david, you've mentioned "math" many times. over and over again, david. > > and over and over again, i've replied that i'll handle equations as > graphics. Oh, wonderful. Let's reproduce the tragic failures of equation typesetting, and add the problem that the font used for the text and the font used for the equations have no similarity. I scan math books frequently to get a more legible copy, not something that preserves all the failures of the original typesetting. > eventually, if there is a compelling need, i might even adapt one of the > existing plain-text solutions for rendering graphics (tex, anyone?) to > do the job. but i doubt there will ever be such a "compelling need" to > pull math equations out of books that are, in most cases, 80+ years old. The way to sway people to use your programs is not to dismiss the things they consider as compelling as unimportant. > yes, david, why are you even bothering to _try_ and do > something that you are so clearly incapable of doing? When you tout a solution that doesn't fix our problems and wonder why we don't flock to it, I'd say you're in your own little world.
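To make the graphics-versus-text disagreement over equations concrete: a formula can travel inside a plain-text file as TeX source and be typeset later by whatever tool the reader has. This is only a minimal sketch, assuming a LaTeX toolchain is available; the quadratic formula is a familiar stand-in and is not taken from any particular PG e-text.

```latex
% Assumption: compiled with any standard LaTeX distribution.
% The formula is illustrative only, not drawn from a PG book.
\documentclass{article}
\begin{document}
The solutions of $ax^2 + bx + c = 0$ are
\[
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\]
\end{document}
```

Source like this stays legible in a plain-text e-mail, can be re-rendered in the same font as the surrounding text, and remains available to screen readers in a way that a scanned image is not. Whether that is worth the markup effort is the question the thread is arguing about.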
From hyphen at hyphenologist.co.uk Mon Oct 16 12:06:28 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Mon Oct 16 12:06:48 2006 Subject: [gutvol-d] online editing of documents, collaboratively In-Reply-To: <595.444e5f55.3265158f@aol.com> References: <595.444e5f55.3265158f@aol.com> Message-ID: On Mon, 16 Oct 2006 13:04:15 EDT, Bowerbird@aol.com wrote: |dave said: |> Just a *silly* USian cultural Oddity. | |yeah right. | |i was thinking we'd made it through |the recent friday-the-13th when -- |rumble! -- earthquake rocks hawaii. | |unrelated, you say? sure it's "unrelated". Our old friend chance. -- Dave Fawthrop From Bowerbird at aol.com Mon Oct 16 12:12:06 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 12:12:18 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: aaron said: > Doesn't sound terribly accessible. "if there is a compelling need..." (but what do you suggest as being more "accessible"?) > For that matter, why would we even care about > crappy old books in the first place. seems like a rather silly position for you to adopt, aaron. and _quite_ a risky one to be trying to put in _my_ mouth. classic literature doesn't "get dated", which is why the "crappy old books" that contain it deserve our attention. _math_ books, on the other hand, and most especially their _equations_, don't fall in quite the same category. either those equations have become "classic" themselves, in which case there is little need to do anything more than present them as illustrations, or they have become _dated_ (in light of further developments), in which case there is no need to do anything more than present them as illustrations. did you notice the symmetry there? at any rate, i am sure that math people have developed ways to share their work with each other via plain-text e-mail, and -- if the need arises -- i will hear from them how they do that, and incorporate those conventions into zen markup language. 
meanwhile, for the 99.7% of the project gutenberg library which currently has no need _at_all_ (let alone any _compelling_ need) for math equations, i don't have to worry about them, thank you. *** oh, and by the way, it is the very fact that _these_ are the kind of "objections" that are made to my list of structures that informs me that that list is sufficiently complete that i don't have to worry about it. if you guys had anything _substantive_ to say, you certainly would, and i would simply say "thanks" and add it to my list of structures... as it is, you've now had _years_ to think about it and scour books to try and find something that falls outside my list, and you've got zilch. and hey, here's a quick piece of advice you might consider: when you've got _zilch_, the best thing to do is stay silent... -bowerbird From joshua at hutchinson.net Mon Oct 16 12:26:13 2006 From: joshua at hutchinson.net (mailbox@hutchinson.net) Date: Mon Oct 16 12:26:24 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: <11633580.1161026773348.JavaMail.?@fh1064.dia.cp.net> ----Original Message---- From: Bowerbird@aol.com > and hey, here's a quick piece of advice you might consider: > when you've got _zilch_, the best thing to do is stay silent... *** And yet you keep making noise... Josh PS Just because you dismiss everyone's arguments, doesn't mean they aren't valid. It just makes you look like a small child that has his fingers in his ears, yelling, "La-la-la-la. I can't hear you! La-la-la-la-laaaaa!" From Bowerbird at aol.com Mon Oct 16 12:31:43 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 12:31:54 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: david said: > You claimed that your list was > "strong, inclusive, and exhaustive", > but only handle 97% of the texts?
97% on the _first_pass_, david. i ain't done yet. how much is the "official" .tei handling right now? > So you avoid the point; that your test-suite > doesn't consist of real life problems and isn't > nearly as painful as the real life problems I see. well, as soon as we've got some systems that can pass _this_ test-suite, then we can start making it tougher with some of the features that are rare in the library. i have repeatedly suggested that the x.m.l. advocates and the p.g.t.e.i. freaks should show us how they would mark up this test-suite and convert it to various formats. but i have had no takers... as you noted, 97% _is_ good if it's head-to-head against something that's not as high. and so far anyway, it's like my 97% against .tei vapor... > That's an editor's job. I'm not an editor; > I merely want to reproduce the text as is. yes, i _am_ an editor. any time you are creating a new version -- and a digitized version of a paper-book _is_ a new version -- you'd better be prepared to be an editor. i don't see any point in "reproducing the text." if i wanna see what the paper-book looked like, i much prefer to just go take a look at the scans. what i want to create is something that _works_well_ as an electronic-book, not that mimics a paper-book. > I scan math books frequently to get a more legible copy, > not something that preserves all the failures of > the original typesetting. you're being inconsistent. do you preserve the original, or not? > The way to sway people to use your programs is > not to dismiss the things they consider as compelling > as unimportant. um, sorry, you're not important enough for me to "sway". i'm just putting myself on the record, so i can return years from now and say "i told you so". believe whatever you will. > When you tout a solution that doesn't fix our problems > and wonder why we don't flock to it, > I'd say you're in your own little world. and i am happy to be in "my own little world" instead of yours. 
it would be quite disconcerting to me to have a bunch of idiots suddenly "flock" into my world; i'd have to rethink _everything_. -bowerbird p.s. and i _still_ haven't added anything to my list of structures... From Bowerbird at aol.com Mon Oct 16 12:34:28 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 12:34:33 2006 Subject: [gutvol-d] online editing of documents, collaboratively Message-ID: meanwhile, does anyone have anything substantial to say concerning the online editing of documents, collaboratively? or is this another great tool we're gonna pretend doesn't exist? -bowerbird From Bowerbird at aol.com Mon Oct 16 12:38:07 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 12:38:18 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: josh said: > Just because you dismiss everyone's arguments, > doesn't mean they aren't valid. It just makes you > look like a small child that has his fingers in his ears, > yelling, "La-la-la-la. I can't hear you! La-la-la-la-laaaaa!" josh's post = troll. -bowerbird From jon at noring.name Mon Oct 16 11:50:48 2006 From: jon at noring.name (Jon Noring) Date: Mon Oct 16 12:49:34 2006 Subject: [gutvol-d] Content MathML (was "oh geez, part 3") In-Reply-To: <001d01c6f152$279e35d0$0300a8c0@blackbox> References: <001d01c6f152$279e35d0$0300a8c0@blackbox> Message-ID: <1269046842.20061016125048@noring.name> Aaron wrote: > Bowerbird wrote: >> oh david, you've mentioned "math" many times.
over and over again, >> david. >> >> and over and over again, i've replied that i'll handle equations as >> graphics. > Doesn't sound terribly accessible. Yep. "The blind be damned". >> eventually, if there is a compelling need, i might even adapt one of the >> existing plain-text solutions for rendering graphics (tex, anyone?) to >> do the job. but i doubt there will ever be such a "compelling need" to >> pull math equations out of books that are, in most cases, 80+ years old. > For that matter, why would we even care about crappy old books in the first > place. Aaron brings up a good point that there is contemporary content being produced. Even if the goal is to disintermediate publishers, one still has to handle mathematical equations in a way which benefits users. This is what is intriguing about the content flavor of MathML, where many (but not all) mathematical expressions can be made "understandable" by mathematics software. This allows the ebook containing such markup to be able to directly call such programs for plotting, solving, etc. The introduction of this chapter about content MathML in the MathML spec is excellent: http://www.w3.org/TR/MathML2/chapter4.html So, how would ZML handle semantic MathML markup? It is, of course in "evil XML", so would that not be allowed in ZML? Jon Noring From joshua at hutchinson.net Mon Oct 16 12:50:08 2006 From: joshua at hutchinson.net (mailbox@hutchinson.net) Date: Mon Oct 16 12:50:11 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: <27903894.1161028208906.JavaMail.?@fh1064.dia.cp.net> ----Original Message---- From: Bowerbird@aol.com >> david said: >> You claimed that your list was >> "strong, inclusive, and exhaustive", >> but only handle 97% of the texts? > > 97% on the _first_pass_, david. i ain't done yet. > how much is the "official" .tei handling right now? 
Everything you've mentioned AND everything David mentioned (there might be something in the Early English stuff he mentioned it can't do, though at least one of them has been done in PGTEI). > i have repeatedly suggested that the x.m.l. advocates > and the p.g.t.e.i. freaks should show us how they would > mark up this test-suite and convert it to various formats. > but i have had no takers... Well, I don't know what "test-suite" of texts you refer to, but honestly, I usually ignore most of your pointless ramblings because I'm too busy actually DOING something. You know, like posting close to 100 books in TEI format to the PG archives. (And before anyone calls me on it, *I* haven't posted that many; I'm including other people's efforts on TEI in that "close to 100". There have been 3 other people that have posted books in TEI format that I know of.) Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/1e5aec35/attachment.html From Bowerbird at aol.com Mon Oct 16 13:00:50 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 13:00:59 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: josh said: > Well, I don't know what "test-suite" of texts you refer to, but > honestly, I usually ignore most of your pointless ramblings there's a frank admission, folks. he doesn't even know what's being discussed, but he feels qualified to throw in a few insults. speaks for itself, that post does, and speaks volumes. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/2f2f0ee4/attachment.html From Bowerbird at aol.com Mon Oct 16 13:12:56 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 13:13:08 2006 Subject: [gutvol-d] oh geez, part Message-ID: and since we've degenerated to the troll level, let me just cut to the chase at the finale, ok? 
ultimately, for music, i will probably do exactly what d.p. has done, and use lilypond or finale, and route that file to either an external player or one that i have embedded in my viewer-app. unlike music-markup-language, lilypond shares my core philosophy of simplicity and elegance... i'll follow the same approach for math equations, routing them to an equation editor that is either (a) an external app, or (b) embedded in my viewer. i'd guess it will probably be tex-based rather than math-markup-language, as tex is widely preferred, and expressible in utf-8. (math-markup-language is also expressible in utf-8, but it's also got all that angle-bracket gunk in it, which i'm badly allergic to.) so, as usual, it's simple as pie for me to "answer" your "objections", i just wanna see how desperate you get. and none of this needs to go in my test-suite yet. (but, so you know, i've already got .mp3 support, and .aiff and a bunch of other music formats, and i'm guessing that quicktime will support .svg soon, if it doesn't already, which can be used for equations, so none of this is the problem it's been made out as.) of course, anyone else is free to make their _own_ test-suite, any time they want. like i told ya earlier, jon noring is looking for just this type of feedback... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/f36fa206/attachment.html From traverso at dm.unipi.it Mon Oct 16 13:23:02 2006 From: traverso at dm.unipi.it (Carlo Traverso) Date: Mon Oct 16 13:21:11 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: (Bowerbird@aol.com) References: Message-ID: <200610162023.k9GKN2N19327@pico.dm.unipi.it> >>>>> "Bowerbird" == Bowerbird writes: Bowerbird> classic literature doesn't "get dated", which is why Bowerbird> the "crappy old books" that contain it deserve our Bowerbird> attention. 
Bowerbird> _math_ books, on the other hand, and most especially their _equations_, don't fall in quite the same category. Bowerbird> either those equations have become "classic" themselves, in which case there is little need to do anything more than present them as illustrations, or they have become _dated_ (in light of further developments), in which case there is no need to do anything more than present them as illustrations. did you notice the symmetry there? Completely false. A lot of contemporary math research "rediscovers" XIXth and early XXth century works that have been forgotten for 70-100 years or more, and restarts from them. Bowerbird> at any rate, i am sure that math people have developed ways to share their work with each other via plain-text e-mail, and -- if the need arises -- i will hear from them how they do that, and incorporate those conventions into zen markup language. Sure we do. We use TeX (or pseudo-TeX fragments). {-b\pm\sqrt{b^2-4ac}}\over{2a} for the solutions of a quadratic equation ax^2+bx+c=0. And if you can read the formula, you can read its TeX form. Carlo From lee at novomail.net Mon Oct 16 13:31:17 2006 From: lee at novomail.net (Lee Passey) Date: Mon Oct 16 13:29:38 2006 Subject: [gutvol-d] any more questions, lee? In-Reply-To: <3b3.9849ea8.32652b21@aol.com> References: <3b3.9849ea8.32652b21@aol.com> Message-ID: <4533EC15.5090901@novomail.net> Bowerbird@aol.com wrote: > lee said: >> "Said the pieman to Simple Simon, >> 'Show me first your penny.'" > > so lee, do you have any more questions > about simple-simon authoring-tools? > > if you do, i'll be happy to answer 'em... Any /more/ questions? If there was a question implied in my post, it didn't get answered, and frankly I don't have any other questions for you.
The only question I have for you is where can I obtain a non-vaporous copy of your authoring tool, so I can run it through its paces. -- Nothing of significance below this line. From joey at joeysmith.com Mon Oct 16 13:39:42 2006 From: joey at joeysmith.com (joey) Date: Mon Oct 16 13:46:12 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: References: Message-ID: <20061016203942.GC29634@joeysmith.com> On Mon, Oct 16, 2006 at 03:12:06PM -0400, Bowerbird@aol.com wrote: > if you guys had anything _substantive_ to say, you certainly would, > and i would simply say "thanks" and add it to my list of structures... > > as it is, you've now had _years_ to think about it and scour books to > try and find something that falls outside my list, and you've got zilch. > > and hey, here's a quick piece of advice you might consider: > when you've got _zilch_, the best thing to do is stay silent... Again: "Argument from ignorance". Just because I choose not to solve your problems for you doesn't mean you don't have problems. From Bowerbird at aol.com Mon Oct 16 14:48:09 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 14:48:20 2006 Subject: [gutvol-d] any more questions, lee? Message-ID: lee said: > The only question I have for you is > where can I obtain a non-vaporous copy > of your authoring tool, so I can run it through its paces. ok, that's a good question. you can't have a copy. not now, anyway, and probably not for another year or so. so i suggest you go off and reinvent the wheel yourself... :+) because, lee, that's what i really want you to do: waste your time... but i'm not really sure why you think you need a special authoring tool to create a .zml file, since any ordinary text-editor would do just fine... remember, david moynihan converted the entire p.g. _library_ (and more) into a stunning array of e-book formats. it's really not that difficult to do... besides, i have pointed you to the markdown dingus. 
if you tell me just exactly why markdown won't serve your purpose, i can help you. -bowerbird From Bowerbird at aol.com Mon Oct 16 14:51:52 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 14:52:00 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: <522.6631b46.326558f8@aol.com> joey said: > Just because I choose not to solve your problems for you > doesn't mean you don't have problems. i'm not asking you to "solve my problems", joey, i'm saying that unless you tell me what they _are_, i'm just gonna have to assume that i don't have any. besides the ones i already know about... :+) but if you want other people to know i have problems, my guess is that you're gonna have to tell those people what my problems are. i sure don't have any difficulties telling people about the problems that _i_ see elsewhere. -bowerbird From Bowerbird at aol.com Mon Oct 16 14:59:11 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 14:59:27 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: <269.11b25bc6.32655aaf@aol.com> carlo said: > A lot of contemporary math research "rediscovers" > XIXth and early XXth century works that have been forgot > for 70-100 years or more, and restarts from them. and they will be able to "restart" from an illustration of the equation just as easily as they've been able to "restart" from that same equation in a paper-book... would it be nice if someone had first done the work of making that equation (and all the other equations in that book, and in every other book) _importable_ into today's equation software?
well, sure, i'd guess, but i would hope that today's mathematicians don't _expect_ us to do that for them as a matter of course. and i hope the architects and engineers don't expect us to make all the diagrams in all the books cad/cam-ready. and musicians shouldn't expect us to input all the music, just so it's immediately available to them without any work. i mean, _sure_, if there are volunteers who _want_ to do this stuff, then i'm all in favor of it, and i can support it _just_as_well_as_other_systems,_thank_you_very_much_. but please, kids, don't hold up your rate of ~2,000 books digitized per year as supporting your workflow or methods. > Sure we do. We use TeX (or pseudo-TeX fragments). and that's why that's what i'll probably do as well, when the time comes that i feel that it's necessary, because that's my modus operandi, to utilize the existing conventions, to best leverage current work. but for now, i'm not at all worried about this "problem". *** now, i have posted how i will handle these issues, so let's all just move on to something more productive... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/4452d45e/attachment-0001.html From Bowerbird at aol.com Mon Oct 16 16:05:48 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 16:05:55 2006 Subject: [gutvol-d] just for the record Message-ID: for the record, here's the list from the "one-laptop-per-child" people... > In simplest terms, a list of our markup requirements is as follows: > 1. Bold, italic, and monospace text > 2. Ordered and unordered lists, nested arbitrarily > 3. Four levels of headings > 4. Blockquotes > 5. Internal and external links > 6. Custom date formatting > 7. References (reference link style for external URLs) > 8. Simple tables > 9. Horizontal rules > 10. Full extensibility with parser hooks looks pretty familiar... 
-bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/10991950/attachment.html From cannona at fireantproductions.com Mon Oct 16 16:50:10 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Mon Oct 16 16:50:16 2006 Subject: [gutvol-d] oh geez, part 3 References: Message-ID: <005901c6f17d$d2c6d590$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I've read several important statements in the past day or so which I feel serve to highlight one of the major issues in this discussion. Bowerbird wrote on ZML: "at any rate, i am sure that math people have developed ways to share their work with each other via plain-text e-mail, and - -- if the need arises -- i will hear from them how they do that, and incorporate those conventions into zen markup language." - From another message, same author: "i mean, _sure_, if there are volunteers who _want_ to do this stuff, then i'm all in favor of it, and i can support it _just_as_well_as_other_systems,_thank_you_very_much_. ... but for now, i'm not at all worried about this 'problem'." On the support for complex tables in ZML, same author: "my own system will eventually be able to handle quite complex tables, when i find the need to develop it that far. and if you'd like some proof, then hand me a list of 100 e-texts that use tables, and i will tackle them first when the time for 'attacking tables' comes up big on my agenda." - From Josh when asked what PGTEI can handle: "Everything you've mentioned AND everything David mentioned (there might be something in the Early English stuff he mentioned it can't do, though at least one of them has been done in PGTEI). ... I'm too busy actually DOING something. You know, like posting close to 100 books in TEI format to the PG archives. 
(And before anyone calls me on it, *I* haven't posted that many; I'm including other people's efforts on TEI in that "close to 100". There have been 3 other people that have posted books in TEI format that I know of.)" A quick look at the PGTEI documentation confirms that pgtei does in fact have support for embedding LaTex equations. So, we've got two competing systems. One of them has been used to publish several PG etexts, and seems to support complex tables, math, and various other formatting brought up today. The other has not been used to produce nearly so many texts and does not yet support math, complex tables, and a few other things. In addition, the maintainer of the latter system apparently does not feel that math and complex tables are a high enough priority yet, and wishes to be shown x number of examples of the need for such before he will add them to his format. This does not cover all of the arguments for and against each system, not by a long shot. However, from the above, it is quite clear, at least to me, which system it would be most beneficial for Project Gutenberg to adopt. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFNBq1I7J99hVZuJcRAjqYAKDotAgDJZpz7ApklVXQZCqbsQ0u+gCgzJ1O tJbeBThERLdgwxYiB+y5PoY= =/Gn6 -----END PGP SIGNATURE----- From Bowerbird at aol.com Mon Oct 16 17:03:54 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 16 17:04:14 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: <277.123dcdf7.326577ea@aol.com> thanks for your "summary", aaron... to repeat, i don't care one whit what system project gutenberg "adopts". indeed, i'd like for josh to spend a whole _boatload_ of time making .tei versions of all the books he does. 
and as soon as josh is ready to have me take another close look at the .html and the .pdfs created by his .tei, i'll be happy to do that too, using the same criteria they failed at last time. -bowerbird p.s. i really should inform you that it is terribly simple for me to convert p.g. .txt files to .zml, so it's not wise of you to do "comparison counts" between .zml and .tei, because that will bite you. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061016/d2db64f1/attachment.html From cannona at fireantproductions.com Mon Oct 16 17:13:59 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Mon Oct 16 17:15:24 2006 Subject: [gutvol-d] oh geez, part 3 References: <269.11b25bc6.32655aaf@aol.com> Message-ID: <008101c6f181$55d42700$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bowerbird wrote: > carlo said: >> A lot of contemporary math research "rediscovers" >> XIXth and early XXth century works that have been forgot >> for 70-100 years or more, and restarts from them. > > and they will be able to "restart" from an illustration > of the equation just as easily as they've been able to > "restart" from that same equation in a paper-book... Just as someone can read a scanned image of a page just as easily as they can read that page in a paper book, so why OCR and proof read at all? > > would it be nice if someone had first done the work > of making that equation (and all the other equations > in that book, and in every other book) _importable_ > into today's equation software? well, sure, i'd guess, > but i would hope that today's mathematicians don't > _expect_ us to do that for them as a matter of course. It depends on your definition of import, but yes, most equations from most books can be imported, in one form or another into software. 
Also, if PG is going to add a math text to the archive, then it would make sense to have a standard format that will support it. Mathematics can be just as much a part of a book as tables can, or any other type of unusual formatting. By your logic, one could make the argument: "would it be nice if someone had first done the work of making that table (and all the other tables in that book, and in every other book) _importable_ into today's table parsing software? well, sure, i'd guess, but i would hope that today's table readers don't _expect_ us to do that for them as a matter of course." In fact, if what you say is true, then, as I mentioned above, it could be argued that every page should be left as an image, because readers shouldn't expect us to do all that work of ocring, proofing and formatting text for them. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. 
iD8DBQFFNCCaI7J99hVZuJcRAl+cAJ9GzFpBDAXhRN+Dhuyq5m1cTIbvMQCfXdAP 51pH9mJ0ChLFElkg8qrl7wQ= =nuJq -----END PGP SIGNATURE----- From traverso at dm.unipi.it Tue Oct 17 05:27:20 2006 From: traverso at dm.unipi.it (Carlo Traverso) Date: Tue Oct 17 05:25:24 2006 Subject: [gutvol-d] oh geez, part 3 In-Reply-To: <269.11b25bc6.32655aaf@aol.com> (Bowerbird@aol.com) References: <269.11b25bc6.32655aaf@aol.com> Message-ID: <200610171227.k9HCRKZ03283@pico.dm.unipi.it> >>>>> "Bowerbird" == Bowerbird writes: Bowerbird> carlo said: >> A lot of contemporary math research "rediscovers" XIXth and >> early XXth century works that have been forgot for 70-100 years >> or more, and restarts from them. Bowerbird> and they will be able to "restart" from an illustration Bowerbird> of the equation just as easily as they've been able to Bowerbird> "restart" from that same equation in a paper-book... Bowerbird> would it be nice if someone had first done the work of Bowerbird> making that equation (and all the other equations in Bowerbird> that book, and in every other book) _importable_ into Bowerbird> today's equation software? well, sure, i'd guess, but Bowerbird> i would hope that today's mathematicians don't _expect_ Bowerbird> us to do that for them as a matter of course. That's exactly what mathematicians (at least, some) are doing. And surely they are not expecting to have it done without semantic markup. OpenMath, MathML, TeXmacs, Doyen are some of the names. They are mostly either TeX based, or can import (carefully written) LaTeX.
Carlo From j.hagerson at comcast.net Tue Oct 17 06:49:39 2006 From: j.hagerson at comcast.net (John Hagerson) Date: Tue Oct 17 06:59:59 2006 Subject: [gutvol-d] Seeking Volunteer who produced e-book 12254 Message-ID: <00a001c6f1f3$1c6e3790$1f12fea9@sarek> Project Gutenberg e-book 12254 is titled Illustrated History of Furniture: From the Earliest to the Present Time, the author is Frederick Litchfield. I have a request from Brazil to obtain a high-resolution scan of illustration 48, if it could be made available. Thank you for your assistance. John Hagerson From sam.bretheim at gmail.com Tue Oct 17 09:37:41 2006 From: sam.bretheim at gmail.com (Sam Bretheim) Date: Tue Oct 17 09:39:26 2006 Subject: [gutvol-d] just for the record In-Reply-To: References: Message-ID: <453506D5.1090801@gmail.com> Several important issues were lost in yesterday's discussion of the relative merits of the PGTEI and ZML production schemes. First, I haven't seen any suggestion here that PG or DP as a whole should adopt a ZML-based workflow, so there's little point to agonizing about whether someone working semi-independently is most comfortable using ZML. As long as the result is good-quality XHTML and text, do we care whether the post-processor or independent submitter used ZML, TEI, Dreamweaver, Word, groff, vi, ed, or a Ouija board? The point is to get the books proofread and distributed. Second, one of the most important reasons to use "heavyweight" markup languages is that they make information about the meaning of a document, rather than just its surface structure, available to automatic search, browsing, and analysis tools. 
For example, extracting bibliographic citations from the presentational information available in PDF/PostScript/HTML/text documents is a serious chore that the teams at CiteSeer, Google Scholar, IEEE, and numerous other organizations have spent many person-years on; the results are still terribly error-ridden, and the programs that produce them are full of shady heuristic guess-work. Extracting references (or virtually any other meaning) from natural body text, without the formalism of an academic paper's References section, is a hard unsolved AI problem. However, if the authors of a document supply proper semantic markup, such as TEI or BibTeX, getting and analyzing citations from the document is trivial. If reader software understands the citations and allusions in a document, it can add clickable links to them, thus making following the references a natural and near-instant action. But links are just the easiest thing we can do with software that understands references. It's nearly as easy to add Xanadu-style bidirectional links, so that when reading a document we can see which other documents refer to it; by counting those references, we can guess the importance of a document. These links can be much more powerful than simple HTML-style links, because they can encode information about the type of reference: Is it a formal citation, an allusion, a quotation, a paraphrase, a plagiarization, ...? Is this article a review of that movie? Is the review positive or negative? References are just one fairly simple example of what semantic and ontological markup makes possible; properly marked-up equations, tables, diagrams, and sheet music can be subjected to all sorts of processing that make them even more valuable to society than they were in their original contexts. Considering how time-intensive properly proofreading a book is, adding a bit of semantic markup at the post-proofing stage is quick and easy. 
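The backlink counting described above can be sketched roughly as follows. This is a minimal illustration in Python, assuming the citations have already been read out of explicit semantic markup into plain lists; the document IDs and the helper names `build_backlinks` and `citation_counts` are hypothetical and do not describe any existing PG, DP, or CiteSeer tool.

```python
from collections import defaultdict

# Hypothetical corpus: each document ID maps to the IDs it explicitly cites.
# With semantic markup these lists are read directly, with no heuristics.
citations = {
    "doc-a": ["doc-b", "doc-c"],
    "doc-b": ["doc-c"],
    "doc-c": [],
}


def build_backlinks(cites):
    """Invert a cites-mapping so each cited document lists who refers to it."""
    backlinks = defaultdict(list)
    for source, targets in cites.items():
        for target in targets:
            backlinks[target].append(source)
    return dict(backlinks)


def citation_counts(cites):
    """Count incoming references per document as a rough importance guess."""
    links = build_backlinks(cites)
    return {doc: len(links.get(doc, [])) for doc in cites}


backlinks = build_backlinks(citations)
counts = citation_counts(citations)
```

Once the references are explicit, the bidirectional view is a single inversion of the mapping; the hard part, as noted above, is getting reliable reference data in the first place.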
Third, the visual appeal and simplicity of the source code of an XML document is entirely beside the point, because hand-editing XML source and hoping it validates is a hackish trick of last resort. Many people do it even now because there are no great affordable XML editors, but ultimately it's not the right way to produce a document. Finally, for those who insist on hand-editing the source of documents, markup languages like XML in which the tag and entity syntax is simple and well-defined have natural advantages over homegrown approaches (MediaWiki, ZML, DP-ish) for which a standard choice of parser substitutes for a good, stable, standardized specification document. One of Larry Wall's principles of language design is that "easy things should be easy, and hard things should be possible". The fact that TEI could do better at easy things is completely outweighed by the fact that the homemade solutions make the hard things impossible. If we were to suggest that all new users learn something like ZML (or, by analogy, a similarly simplistic piece of software like iMovie or FrontPage), at a certain point most of them would grow beyond the capabilities of the software, and knowing how to use it would give them no insight whatsoever into producing more complex results with more sophisticated software; they would need to relearn everything, starting from square one. Ease of importing of text with conflicting encodings is one advantage: natural-language punctuation aside, in XML there are precisely three metacharacters to worry about; if you try to paste a mathematical or technical expression into a homemade language, you have the same nasty metacharacter-escaping problems dealt with by anyone who's tried to alter C source files using a bash script assisted by snippets of sed and awk. If the language changes and makes more characters special, porting a document to a new version of the language becomes terribly difficult. 
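As a small illustration of the three-metacharacter point, here is a sketch using Python's standard library to escape text for XML element content. The sample expression is made up for the example, and attribute values would additionally need their quote characters escaped.

```python
from xml.sax.saxutils import escape, unescape

# A made-up technical expression containing all three XML metacharacters.
expression = "if (a < b && c > d) return x;"

# escape() replaces only &, <, and > so the text is safe as element content.
escaped = escape(expression)

# unescape() reverses the mapping, so the round trip is lossless.
restored = unescape(escaped)
```

Everything else in the pasted expression passes through untouched, which is the property a punctuation-based syntax has trouble guaranteeing as it grows.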
But what it really comes down to is extensibility: it doesn't take long to run through the 32 non-alphanumeric printable characters in ASCII and the few ways you can combine them with the 3 acceptable whitespace characters, particularly if you let users of every world language punctuate sentences in the fascinating array of ways that they invent. If you make a genuine attempt to add all of the semantic and presentational functionality that people need to a homemade markup language, at a certain point you will need to add increasingly long and complex expressions to deal with things that you didn't think of when you were designing the language, and while the source code will no longer be any easier to read or write than XML, you will still lack the large amount of careful, well-reasoned work that's gone into adding those features to TEI already, and you'll have nothing like the power of XML namespaces. The quest for "markup without markup" reminds me of Flannery O'Connor's "Holy Church of Christ Without Christ": ultimately not an honest or achievable aim. A markup language is a markup language, regardless of whether its syntax uses tags between angle brackets or punctuation characters with carefully chosen whitespace. Esoteric programming languages like Whitespace demonstrate this fact well. From Bowerbird at aol.com Tue Oct 17 11:41:55 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Oct 17 11:42:11 2006 Subject: [gutvol-d] just for the record Message-ID: wow, a thoughtful message, what a breath of fresh air, a nice change from all the trolls, thank you much, sam. i'll respond a bit later... :+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061017/2cce2f1e/attachment-0001.html From Bowerbird at aol.com Tue Oct 17 11:54:56 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Oct 17 11:55:05 2006 Subject: [gutvol-d] oh geez, part 3 Message-ID: carlo said: > That's exactly what mathematicians (at least, some) are doing. and i would expect that, as time goes by, the burgeoning drive towards simplicity -- which is happening all over cyberspace -- will take root in that work too. and honestly, until that happens, there's no need for me to jump in prematurely with my pick on it. i'll let y'all get things sorted out first, and then cherrypick the best. since google will be scanning 10-40 million books, there are plenty of non-math texts for me to work on before i do the math books... > OpenMath, MathML, Texmacs, Doyen are some of the names. They > are mostly either TeX based, or can import (carefully written) LaTeX. this only makes sense, since tex is so well-established in that world. it's hard for me as a non-mathematician to know where the line is between tex as an equation editor and tex as a typesetting tool... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061017/b0a9e096/attachment.html From Bowerbird at aol.com Tue Oct 17 15:09:43 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Oct 17 15:09:54 2006 Subject: [gutvol-d] just for the record Message-ID: sam said: > First, I haven't seen any suggestion here that PG or DP as a whole > should adopt a ZML-based workflow, so there's little point to agonizing > about whether someone working semi-independently is most comfortable > using ZML that's basically right. 
the situation is a little more complicated than that, because i'm saying eventually there will indeed be a policy shift to zen markup language, or something very similar to it, because it will be seen to be superior... but based on a whole bunch of experience -- that flurry yesterday?, it used to be like that here _every_day_, i kid you not, it was stupid -- i'm convinced the people here now won't be able to see it very soon. and even once they can see it, their pride will make them deny it... but their system will collapse in on itself, assuming it even gets built, and the people who clean up the mess will rebuild on simplicity instead. so my aim at present is to mold the p.g. library as my own z.m.l. library; my ability to then maintain and extend it will serve as a model for others. > As long as the result is good-quality XHTML and text, do we care > whether the post-processor or independent submitter used > ZML, TEI, Dreamweaver, Word, groff, vi, ed, or a Ouija board? > The point is to get the books proofread and distributed. just a couple of quick points about this... first, you've begged the question by positing "good-quality xhtml" as the objective. _my_ actual objective is "highest-quality e-books." further, the "point" is not just to get books proofread and distributed, it's to create a library which (a) is maintained with a minimum of effort, and (b) gives users maximal power to use the books in a variety of ways. i don't think you'll disagree with either of those elaborations. and if you do, i'm pretty sure we can just agree to disagree... > Second, one of the most important reasons to use "heavyweight" > markup languages is that they make information about the meaning > of a document, rather than just its surface structure, available to > automatic search, browsing, and analysis tools. i know that's one of the things that is _promised_ by heavy markup. but i don't see it being delivered in a meaningful way, not at present.
and i think there will be serious problems with realizing the promise. in fact, it is my opinion that "invisible" markup has a _better_ chance of delivering on this promise, as it's better at "staying out of the way", _plus_ it entails fewer up-front costs and thus is on a faster track to a better cost-benefit ratio. the only way that heavy markup can compete is if it can be applied _programmatically_. but then, of course, it would be possible to bake the markup-application code into the end-analysis program, and eliminate the markup middleman entirely. (do you grok this? because it's really the most important argument against markup.) > For example, extracting bibliographic citations from the presentational > information available in PDF/PostScript/HTML/text documents > is a serious chore that the teams at CiteSeer, Google Scholar, IEEE, > and numerous other organizations have spent many person-years on; > the results are still terribly error-ridden, and the programs that > produce them are full of shady heuristic guess-work. i understand exactly what you're talking about. one of the "hidden secrets" about the philosophy of "invisible markup" is that we have to learn how to make that information _transparent_... in other words, we need to _format_ our documents so that the info that we might want to extract from them becomes _obvious_, both to computers and to humans. once you understand that this is the key, you'll realize that it's really not that difficult. indeed, it's not hard at all. contrast this with the heavy-markup approach, where modus operandi is to _label_everything_. that method will "work", yes it certainly _will_, since ambiguity is reduced to a minimum if you've labeled everything, but the labeling process is extremely costly. if we consider all the text that we have out in the world, the idea of labeling it all is preposterous. 
even if it _was_ possible, it would quickly become absolutely unworkable, since the labels would soon be tripping over themselves, and the actual _content_ would quickly become completely obscured by all the labeling. and yes, it's gonna take a little while before mankind realizes all of this. and we'll be misled by mid-stream "successes" where markup "works" before it collapses on itself. but it won't take long for confusion to reign. and -- perhaps more importantly -- since the _costs_ have to be paid up-front, before any of the benefits even start to accrue, the probability that people will be suckered into doing markup is _greatly_ diminished. meanwhile, a movement toward _simplicity_ has already begun in earnest. ordinary people don't want to be bothered with doing markup, so we have begun inventing ways to have it applied automatically, in the form of z.m.l. and markdown, textile, wiki-formatting, plus a myriad of others to follow... people have also become allergic to "feature creep", so now the innovators (like, say, 37signals) have made "simplicity" the key _feature_ in their apps... and there's no turning back of this momentum. who needs complexity? the technoids who've been trying to shove complexity down our throats -- whether because they personally prefer _difficulty_, or because they see our _helplessness_ as the key to their economic future, or _both_ -- are going to fail. i repeat: they are going to fail. it's really that simple... > Extracting references (or virtually any other meaning) > from natural body text, without the formalism of an > academic paper's References section, is a hard unsolved AI problem. right. and that problem will probably _never_ be solvable. but did you notice that, _with_ the "formalism" (as you have put it) of a "references" section, the problem reduces to an utterly easy task? now, once you understand that what you call "formalism" is what i call "transparency", then you'll have a good understanding of what i mean.
and material in that "references" section does _not_ have to be labeled with heavy x.m.l. markup for it to be effectively and efficiently parsed... a good example of this is "zotero", dan cohen's new firefox plug-in. it "senses" when a web-page has cataloging-type information on it, and will automatically capture that data and save it in your database, so you can search it later, bundle it into your own papers, and so on. > However, if the authors of a document supply proper semantic markup, > such as TEI or BibTeX, getting and analyzing citations from the document > is trivial. the focus of heavy-markup advocates is always on the _benefit_ of the "trivial" retrieval, with the substantial _cost_ of the initial encoding being waved off as if it happens by magic. unfortunately, this makes for great _snowjobs_by_salesmen_. fortunately, once people have been suckered by this "free lunch" promise several times, they _will_ wise up eventually. once they find out that markup doesn't deliver what it had promised, the scenario will be "you need more markup, and better markup, and -- oh, by the way -- that more better markup will also cost you extra", and the snake-oil salesman will be booted out the door. end of story. > If reader software understands the citations and allusions > in a document, it can add clickable links to them, thus making > following the references a natural and near-instant action. right. and this is what we expect from an electronic document. but what we really want is to have those links made automatically, instead of having to code them all manually. that's too much work. > But links are just the easiest thing we can do with software > that understands references. It's nearly as easy to add > Xanadu-style bidirectional links, so that when reading > a document we can see which other documents refer to it; > by counting those references, we can guess the importance > of a document.
> These links can be much more powerful
> than simple HTML-style links, because they can encode
> information about the type of reference: Is it a formal citation,
> an allusion, a quotation, a paraphrase, a plagiarization, ...?
> Is this article a review of that movie? Is the review positive or negative?

again, cool stuff to have. but a tremendously expensive pain to code, if done manually. so find a way to apply the markup programmatically.

and then, once you've found the way to apply markup programmatically, just put that code into the program that _your_system_ would eventually have _interpreting_ that markup. you'll find that you have saved yourself enough time and energy by not having to write the markup-interpretation routines that your effort in programmatic-understanding pays for itself...

and this is the "secret sauce" of my approach, sam, that we put intelligence into our _programs_ instead of into our _markup_, because the intelligence in _programs_ can be applied to new content without any additional work, whereas with a markup approach that new content must be marked up first. even if markup establishes a toehold, this ongoing expense will kill it off...

> properly marked-up equations, tables, diagrams, and sheet music
> can be subjected to all sorts of processing that make them
> even more valuable to society than they were in their original contexts.

the question as to whether they are "more valuable" needs to include an accurate assessment of the cost of applying that "proper" markup. there might well be more benefits after application of "proper markup", but if the _costs_ of applying that markup outweigh the added benefits, it is unwise to go down that path. you might not _like_ that conclusion, but from the entirely reasonable perspective of cost-benefit analysis, sam, it's simply not a conclusion that you can avoid. sorry about that...

> Considering how time-intensive properly proofreading a book is,
> adding a bit of semantic markup at the post-proofing stage is
> quick and easy.

but if we can get the same benefits _without_ that markup, then there's simply no good reason to apply it, is there? even if we get _almost_ the same benefits, if the application of markup is expensive (as it will be), cost-benefit says don't-do.

> Third, the visual appeal and simplicity of the source code of an XML
> document is entirely beside the point, because hand-editing XML source
> and hoping it validates is a hackish trick of last resort. Many people
> do it even now because there are no great affordable XML editors, but
> ultimately it's not the right way to produce a document.

at least you are honest about the stupidity of editing x.m.l. source, and the fact that there are no great affordable x.m.l. editors. kudos!

anyway, as i said at the top, the cost of _maintaining_ a cyberspace library is one of the most important considerations that we need to keep in mind. and it's just one more arena where z.m.l. will shine...

> at a certain point most of them would grow beyond the capabilities
> of the software, and knowing how to use it would give them no insight
> whatsoever into producing more complex results with more sophisticated
> software; they would need to relearn everything, starting from square one.

you've made the natural assumption that a "lightweight" markup will have a capability-ceiling that is deficient when compared to a "heavyweight" one. i can't say it with definite authority quite yet, but i highly suspect that this "natural" assumption will prove to be greatly misleading in the case of z.m.l.

i can say with much confidence that for 95% of the project gutenberg e-texts, there will be _no_ difference in performance capability between z.m.l. and t.e.i. and the number might well jump to 97%, or even 99%, or even 99.5%.
again, equivalent benefits with much lower costs is a brain-dead decision...

> in XML there are precisely three metacharacters to worry about

metacharacters are a pain. so i've tried to think outside that box. we'll see in the long term how well i have succeeded.

> If you make a genuine attempt to add all of the semantic
> and presentational functionality that people need to a
> homemade markup language, at a certain point you will
> need to add increasingly long and complex expressions
> to deal with things that you didn't think of when you were
> designing the language, and while the source code will no
> longer be any easier to read or write than XML, you will still
> lack the large amount of careful, well-reasoned work that's
> gone into adding those features to TEI already, and you'll
> have nothing like the power of XML namespaces.

that's a rather bleak if:then statement you've laid out, sam, so let's see if i can short-circuit it at the start, so as not to have to worry about it, ok? do you see any features common to books that i've left out of my test-suite? if so, do please let me know...

> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
> http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml

> The quest for "markup without markup" reminds me of
> Flannery O'Connor's "Holy Church of Christ Without Christ":
> ultimately not an honest or achievable aim.

whether it is "achievable" is an open question. since i'm the one putting in the work, though, i assume you don't mind how i spend my time.

as to whether this is an "honest" aim, well... i suppose all the technoids who were hoping to make a cushy living from x.m.l. consulting will feel that i've tricked them "unfairly", but i see myself as saving the human race from a big bunch of needless and costly complexity.

> A markup language is a markup language

well, yes, it is...

> A markup language is a markup language,
> regardless of whether its syntax uses tags
> between angle brackets or punctuation
> characters with carefully chosen whitespace.

...but some markup languages are more beautiful than others, and -- at least in my humble opinion -- the invisible ones are the most beautiful of all... :+)

-bowerbird

From Bowerbird at aol.com Wed Oct 18 04:11:03 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 04:11:10 2006
Subject: [gutvol-d] since lee doesn't have any more questions
Message-ID:

ok, lee, so as i said the other day, i want you to waste your time reinventing the wheel, by writing a program i've already written (which you seem to be fond of insinuating is mere vapor)...

the thing is, i want you to hurry up and write it, so we can move on to the next step after that. it's 2006 already, and time is a-wasting.

to that end, i'm willing to advise you in the writing of your version. we can make it open-source -- i'll direct you, and you'll program.

so each day, i'll give you a little assignment for a routine to write -- i'll even give you the pseudo-code for it -- and then when you come back with the routine finished, we'll go on to the next one...

if you need some additional assistance after the pseudo-code, i'll provide you some source written in basic (easy to understand). if you still need more help after that, i'll give you the code in perl.

of course, you'll always get the correct answer, to verify your work. given direction, examples, and answers, i'm sure you will succeed.

you -- of course -- can write the program in any language you like.
(and you can make it a web-based program, offline application, or both, whichever you prefer and are capable of, makes no difference, as it is the discipline of writing the code that will be your main gift.) and the lurkers out there can download it from wherever you put it, and run it on their machines, to beta-test it, and give their feedback to us on any bugs. hey, it'll be a nice little group project, lots of fun... within just a few weeks time, we'll build from the very beginning up to a solid program that acts capably, and we'll all be happy and amazed... (time-wise, outside, even slow/steady progress will finish in 6 weeks.) at the same time, if you like, we can also work from the end to the start, by reverse-engineering some of the e-books i've already placed online. *** for our content, we'll use a book with which you might be familiar... > http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml so this is your input file, the .zml "master version" that generates others. *** here, let me repeat that url for you, because it's important... > http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml whether you input the text into your program from the web each time, or from a saved-file version on the local machine, is (again) up to you. and yes, this _is_ "my antonia", digitized by jose menendez, jon noring, and a flock of others. under my direction, you'll write a program that will turn this nifty .zml version of the book into some solid .html files. then we can convert that .html to a wide variety of our e-book formats. i will also show you how to write routines to get a nice .pdf of the text. all from a measly "lightweight" file in z.m.l. -- "zero markup language", the "virtually invisible" markup that's 2 steps more advanced than x.m.l. and before you know it, you'll have the wondrous program that you seek. if you need to support any more formats after that, we can talk about it. 
but for the most part, if i can't convert to it from .html, i ain't interested.

***

it'll be a very simple assignment you get every day, so it won't take much work to get it done. and you'll see regular progress, so it'll be rewarding.

and if anyone else wants to jump in, please feel free! it's a group project! a nice group hug, which we all need after the recent gutvol-d antipathy...

and hey, wouldn't it be _neat_ to have this program being simultaneously developed in multiple languages, such as perl and python, php and ruby! let's have some flash, experiment with ajax, play with abuncha cool hacks. 87 different tools ordinary people can use to convert our electronic-books. just try and tell me that wouldn't be cool; go ahead fool, try to persuade me.

and quick, someone tell david rothman we'll solve his tower of babel with universal translation tools that let every book be understood by everyone. heck, to honor douglas adams, we might even call it "babelfish"...

"we don't need no education... teachers leave those kids alone..."

***

so here we go! today's assignment:

1. write the shell of a program that reads the text-file and then writes it out as a simple .html web-page, formatted with [pre]. (note: in these assignments, i will use square brackets instead of the traditional .html angle-brackets, to eliminate any confusion.)

that's right, just a straight read and then a straight write. easy. nothin' fancy, just "preformatted". read input and write output.

pseudo-code:
1a. read file.
1b. prepend .html header-info.
1c. display as web-page.

***

extra-credit assignment:

1x. lines with double-curly-braces give the scan-filename and the running head for each page; since some readers will want to eliminate these to rejoin the text's normal flow, write a routine tagging each line
(a) followed by a line containing curly-braces,
(b) or _is_ a line that contains curly-braces,
(c) or is preceded by a line containing curly-braces.

in other words, you will eliminate each line with curly-braces, _and_ each line above it and below it. all the untagged remaining lines -- the book in its normal flow -- will be written to a new file, for the user's uninterrupted enjoyment.

pseudo-code:
1xa. read the lines of the file into an array.
1xb. step through array marking appropriate lines for deletion.
1xc. write all unmarked lines into a new file.

like i said, all of the routines will be simple. they won't all be as simple as _these_ routines -- which are _very_ elementary, we're jus' startin' slow, don't be offended they are so easy -- but we'll still have lots of fun, yes we will... :+)

-bowerbird

p.s. if you want to pursue the reverse-engineering angle, the .html files that were the end-result of my program are available at the base url which you can determine from the cover:
> http://www.greatamericannovel.com/myant/myantc001.html
and proceeding through the forward-matter:
> http://www.greatamericannovel.com/myant/myantf001.html
and then the pages themselves:
> http://www.greatamericannovel.com/myant/myantp001.html
page 123, for instance, can be seen at:
> http://www.greatamericannovel.com/myant/myantp123.html

if you can write a program _in_one_week_ that outputs the series of .html files represented, you will receive high honors in this class, and an automatic promotion to the advanced version of this school.

From Bowerbird at aol.com Wed Oct 18 04:30:40 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 04:30:55 2006
Subject: [gutvol-d] re: since lee doesn't have any more questions
Message-ID:

i said:
> "we don't need no education... teachers leave those kids alone..."
yes, of course i know it's "them kids", not "those kids", it's just that i was playing with google to do a comparison count of the forms -- > the correct "leave them kids alone" = 18,600, > the incorrect "leave those kids alone" = 11,100 > the incorrect "leave the kids alone" = 10,600 > the unique "leave us kids alone" = 196 > the rarest "leave these kids alone" = 9 ("yay", ask-the-audience gives right answer again, wisdom of crowds) -- and somehow at 4:20 in the morning i posted the wrong version. so sue me... ;+) -bowerbird p.s. speaking of which, google considers youtube (and itself) to be totally immune to any copyright concerns, under the "safe harbor" shield, because they immediately take down once they get a notice, which is all you have to do to rescue yourself from a suit under dmca. kinda funny how easy it is to make yourself immune, isn't it?, especially since it doesn't really matter _how_ dirty you were right up to that point. youtube, for instance, is dirty as can be. and yet washed free immediately. very very interesting... "hey teacher leave" = 41,400 -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/4300fd04/attachment.html From marcello at perathoner.de Wed Oct 18 05:02:22 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Oct 18 05:02:26 2006 Subject: [gutvol-d] just for the record In-Reply-To: References: Message-ID: <453617CE.9000605@perathoner.de> Bowerbird@aol.com wrote: > _my_ actual objective is "highest-quality e-books." And after 3 1/2 years of development your _status_ is: "This page is not Valid HTML 4.01 Transitional!" 
http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001.html -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Wed Oct 18 10:10:51 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 10:10:59 2006 Subject: [gutvol-d] just for the record Message-ID: <304.1059097e.3267ba1b@aol.com> marcello said: > And after 3 1/2 years of development your _status_ is: > "This page is not Valid HTML 4.01 Transitional!" oh no! i've not been validated! my life is so shallow! :+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/b1b8bf7e/attachment.html From Bowerbird at aol.com Wed Oct 18 10:34:47 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 10:34:58 2006 Subject: [gutvol-d] re: my shallow life Message-ID: <235.10ff3d00.3267bfb7@aol.com> distraught, i sought out the source of my invalidity: > Error Line 28 column 16: > an attribute value must be a literal unless it contains only name characters. > [[1]]
oh my word. how could i ever right this terrible wrong? existence is such a sad tragedy, is it not? alas, i despair. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/0e91de23/attachment.html From Bowerbird at aol.com Wed Oct 18 11:16:25 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 11:16:32 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: rescued from the desert of invalidity and refreshed to face another day, i remind myself of the old man in the monty python movie who insists "i'm not dead yet", and who is then, of course, smashed over the head... http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myan tp001.html -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/c1bd0214/attachment-0001.html From hyphen at hyphenologist.co.uk Wed Oct 18 11:21:08 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Wed Oct 18 11:21:25 2006 Subject: [gutvol-d] re: my shallow life In-Reply-To: <235.10ff3d00.3267bfb7@aol.com> References: <235.10ff3d00.3267bfb7@aol.com> Message-ID: On Wed, 18 Oct 2006 13:34:47 EDT, Bowerbird@aol.com wrote: |distraught, i sought out the source of my invalidity: |> Error Line 28 column 16: |> an attribute value must be a literal unless it contains only name |characters. |> [[1]]
| |oh my word. how could i ever right this terrible wrong? |existence is such a sad tragedy, is it not? alas, i despair. After chasing down an html problem intermittently for a few days I hate the validation suite as well, it does not give the definitive explanation of what was wrong as one would expect of modern software. Just everlasting complaints about stuff downstream/after of the real problem. ************************* ***HATE HATE HATE HATE*** ************************* ******************* ***CRAP SOFTWARE*** ******************* -- Dave Fawthrop From desrod at gnu-designs.com Wed Oct 18 11:27:24 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Wed Oct 18 11:32:10 2006 Subject: [gutvol-d] re: my shallow life In-Reply-To: References: <235.10ff3d00.3267bfb7@aol.com> Message-ID: > |> [[1]]
> |oh my word. how could i ever right this terrible wrong? > |existence is such a sad tragedy, is it not? alas, i > |despair. Font tags are being deprecated, unnecessary, evil, wastes bandwidth and should be left to gather dust in the corner.. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From marcello at perathoner.de Wed Oct 18 11:50:26 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Oct 18 11:50:30 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <45367772.3090206@perathoner.de> Bowerbird@aol.com wrote: > rescued from the desert of invalidity and refreshed to face another day, > i remind myself of the old man in the monty python movie who insists > "i'm not dead yet", and who is then, of course, smashed over the head... > > http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myan > tp001.html http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html "This page is not Valid HTML 4.01 Transitional!" -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Wed Oct 18 12:06:21 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 12:06:31 2006 Subject: [gutvol-d] re: my shallow life Message-ID: david said: > Font tags are being deprecated, unnecessary, > evil, wastes bandwidth and should be > left to gather dust in the corner.. it was bad enough when my existence was meaningless, but now my work has been characterized as "evil". lordee! how can i recover from this severe blow to my self-esteem? :+) -bowerbird p.s. i think you'll find some c.s.s. in those .html templates, and yes, since that particular bit of colorization was for the pagenumber, i agree that it should be tagged more clearly. p.p.s. dave, the best way to think about the validator is as an idiot-savant, who knows when things are "wrong", but gets tongue-tied and can't really tell you _why_. 
if you think of the poor thing in this way, you'll be able to forgive its autism. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/0cffc00a/attachment.html From desrod at gnu-designs.com Wed Oct 18 12:12:02 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Wed Oct 18 12:11:15 2006 Subject: [gutvol-d] re: my shallow life In-Reply-To: References: Message-ID: > it was bad enough when my existence was meaningless, but now my work > has been characterized as "evil". lordee! Not your work, your implementation. > how can i recover from this severe blow to my self-esteem? :+) Ignore it and move on to bigger and better things. > p.p.s. dave, the best way to think about the validator is as an > idiot-savant, who knows when things are "wrong", but gets > tongue-tied and can't really tell you _why_. if you think of the > poor thing in this way, you'll be able to forgive its autism. Not sure who "Dave" is, but I'll respond: The validator told you exactly "why" it was wrong, now its up to you to figure out how to fix it. The validator isn't mystical or confusing at all, if you read the errors and warnings it reports. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From Bowerbird at aol.com Wed Oct 18 12:12:08 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 12:12:13 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: marcello said: > http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myan tp002.html there are 428 pages in this book. are you going to point to this same "error" on every one? because, you know, that might actually be kind of _fun_... 
so -- in the spirit of distributed proofreaders and "a page a day" -- i'll correct page 2 tomorrow, whereupon you can point to page 3, and then the next day i'll correct page 3, so you can point to page 4, and the day after that i'll fix page 4 and you will then point to page 5, and along about the start of 2008, we'll be all finished with this book! then we can start on another one. come on, marcello, it'll be lots of fun! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/e6c42aa8/attachment.html From desrod at gnu-designs.com Wed Oct 18 12:17:36 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Wed Oct 18 12:17:18 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <45367772.3090206@perathoner.de> References: <45367772.3090206@perathoner.de> Message-ID: > http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html > "This page is not Valid HTML 4.01 Transitional!" The error is obvious. If you must use these broken tags, use them as follows: [[2]]
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From desrod at gnu-designs.com Wed Oct 18 12:22:04 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Wed Oct 18 12:21:19 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To:
References:
Message-ID:

> there are 428 pages in this book. are you going to point to this
> same "error" on every one?

> so -- in the spirit of distributed proofreaders and "a page a day"
> -- i'll correct page 2 tomorrow, whereupon you can point to page 3,

Here, let me fix page 2 through 428 for you in one shot:

    perl -pi.$$ -e \
      's|color=rgb\(222,99,99\)|color="rgb\(222,99,99\)"|g' \
      `find . -type f -name '*html'`

(all on one line, of course)

Now let's get back to doing real work, instead of bickering about doing it.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From Bowerbird at aol.com Wed Oct 18 12:22:41 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:22:50 2006
Subject: [gutvol-d] re: my shallow life
Message-ID:

david said:
> Not sure who "Dave" is, but I'll respond:

evidently, you didn't read the post from dave fawthrop; he was the one complaining about the validator output.

i've been stumped at times in the past, but i found this quite easy to understand, since i had used the realbasic method of specifying color -- rgb(rrr,ggg,bbb) -- and not the html method. as i'm sure you know quite well, that kind of transference error happens a lot when you jump from one environment into another. fact of life...

anyway, david, your taking all of this seriously is putting a real crimp on the humor of my mockery... ;+)

and bottom line, all of these .html files will eventually be spit out on demand by a script, which can be changed on a whim, so any particular manifestation of the .html is so transitory as to be unworthy of examination. really. just so's you know...
-bowerbird

From Bowerbird at aol.com Wed Oct 18 12:25:03 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:25:18 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID:

david said:
> The error is obvious.

um, gee, thanks david, thanks a lot, i really appreciate it.

-bowerbird

p.s. [heavy sigh] ;+)

From Bowerbird at aol.com Wed Oct 18 12:26:48 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:26:53 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID:

david said:
> Here, let me fix page 2 through 428 for you in one shot:

you know, somebody always has to ruin the fun...

-bowerbird

From Bowerbird at aol.com Wed Oct 18 12:40:59 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:41:09 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
Message-ID:

meanwhile, i am really excited about this "babelfish" project!

of course, lee hasn't responded yet, but i say, i just can't wait! this is gonna be _so_ much fun! it's difficult to _contain_ myself!

so even though i'm not _supposed_ to give a hint, via basic code, until the day _after_ an assignment, i simply _must_ jump the gun.

so here's realbasic code for an app that loads in the text-file from your hard-drive and displays the text proudly in a scrolling editfield.

> dim f as folderitem
> f=getfolderitem("myant.zml")
> if f<>nil and f.exists then f.openstylededitfield editfield1

that's it, folks! that's all it takes to load a text-file and display it! so we've accomplished the first day's lesson! pretty simple, eh? see, i told you this was gonna be fun! i feel validated already!

in fact, that was so easy, let's do the extra-credit one too, ok?

> dim tl(-1) as string rem -- tl (for "the lines") is an array
> dim x as integer
> tl=split(editfield1.text,chr(13)) rem -- read text into array
> tl.insert(0,"") rem -- do a little housekeeping for clarity
> tl.append("") rem -- a little more housekeeping for clarity
> for x=1 to ubound(tl)-1
> if instrb(1,tl(x+1),"{{")>0 then tl.remove(x) rem -- delete preceding
> next x
> for x=2 to ubound(tl)
> if instrb(1,tl(x-1),"{{")>0 then tl.remove(x) rem -- delete following
> next x
> for x=1 to ubound(tl)
> if instrb(1,tl(x),"{{")>0 then tl.remove(x) rem -- delete curly-braces
> next x
> editfield1.text=join(tl,chr(13)) rem put the new data back in field

wow, that was easy too. and we're all done for the day. how about a cold beer?

but wait a minute here. the output shows us the assignment was flawed in a certain respect, and will need to be respecified slightly.

that's ok, that's one of the good things that code does, it forces you to _be_specific_and_precise_ with specifications -- to be _exact_ -- and that's a very good way to be, an _excellent_ way to be, yes sir...

yes indeed, a half-hour of coding can help you hone your thinking better than three-and-a-half months of listserve posts. seriously!

so, can you tell me what the flaw was in the original specification?

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/983be7c2/attachment-0001.html From marcello at perathoner.de Wed Oct 18 15:02:58 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Oct 18 15:03:03 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <4536A492.80200@perathoner.de> Bowerbird@aol.com wrote: > marcello said: >> > http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myan > tp002.html > > there are 428 pages in this book. > > are you going to point to this same "error" on every one? Are you going to fix the error on page 1 and hope the other 427 pages fix themselves? -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Wed Oct 18 15:26:46 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Oct 18 15:26:51 2006 Subject: [gutvol-d] re: my shallow life In-Reply-To: References: Message-ID: <4536AA26.301@perathoner.de> Bowerbird@aol.com wrote: > and bottom line, all of these .html files will eventually be > spit out on demand by a script, which can be changed > on a whim, so any particular manifestation of the .html > is so transitory as to be unworthy of examination. really. The funny thing here is that while you are giving us an endless song and dance about your "highest quality ebooks" > _my_ actual objective is "highest-quality e-books." it is quite evident that you have implemented no QA processes whatsoever or you would have caught so simple an error before posting 428 files containing that error. Bowerbird "uality" ebooks. I'm sure! And don't get me started about your complete lack of WAI accessibility features while your user interface is so ugly that only a blind person might actually want to use it. 
-- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Wed Oct 18 18:15:48 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 18:15:55 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: marcello said: > Are you going to fix the error on page 1 > and hope the other 427 pages fix themselves? no! _i_ am gonna fix each page, by hand, one per day, assuming that i receive an error-report from you on it... tomorrow i fix page 2. oh yes, this will be so much fun! > it is quite evident that you have implemented no QA processes whatsoever > or you would have caught so simple an error before posting 428 files > containing that error. "that error" -- as you put it -- means two things, and two things only. 1. the pagenumber is black instead of some other color. 2. the .html doesn't "validate". neither of those things is of the _slightest_ concern to me. sorry. indeed, in fact, if memory serves me correctly, i left "that error" in on purpose, precisely because it caused the page not to validate, and i knew that would be upsetting to you technoid wieners here. and yes, i realize that it's a smallish cruelty of sorts to poke the o.c.d. tendencies of some people here for something so fully insignificant, and i hope people forgive me, as i feel badly about that, i _really_ do... (no i don't. yes i do. no i don't. yes i do. no i don't. yes i do. no i don't.) > And don't get me started about > your complete lack of WAI accessibility features > while your user interface is so ugly that > only a blind person might actually want to use it. nah, the blind people will prefer the plain-text .zml version, since there isn't any crap in there to gunk up screen-readers. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/c96317cc/attachment.html From Bowerbird at aol.com Wed Oct 18 18:22:01 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Oct 18 18:22:11 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: <556.9857302.32682d39@aol.com> dang, i know i'm supposed to wait, but i'm just so excited, i can't stop myself! > http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html page 2 validates! woo-hoo! your move, marcello! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061018/d084c16b/attachment.html From lee at novomail.net Wed Oct 18 20:15:15 2006 From: lee at novomail.net (Lee Passey) Date: Wed Oct 18 21:03:25 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: <4536EDC3.5060403@novomail.net> Bowerbird@aol.com wrote: > meanwhile, i am really excited about this "babelfish" project! > > of course, lee hasn't responded yet, but i say, i just can't wait! Heavens, please don't wait for me! I'll admit that Mr. Noring's suggestion of creating an authoring tool that is word processor-like and WYSIWYGy, that would permit author's to create e-books in a user-friendly environment, that would save documents in a powerful and useful format, and would convert to multiple existing e-book formats at the click of a button, is a very intriguing programming project. On the other hand, learning your approach to converting ZML to HTML is of virtually no interest to me at all. If, however, you were to create an instance of the aforementioned authoring tool (which is what I thought you were claiming) and it only saved its work in ZML, then maybe I /would/ take the hour or two it would require to create a program to convert a ZML file into something useful. 
Or if you were to clean up several thousand of the PG files which are not currently available in HTML and make them available in ZML. But currently there is no compelling reason to pay any attention whatsoever to ZML. From hyphen at hyphenologist.co.uk Wed Oct 18 23:39:28 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Wed Oct 18 23:39:43 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: On Wed, 18 Oct 2006 15:40:59 EDT, Bowerbird@aol.com wrote: |meanwhile, i am really excited about this "babelfish" project! | babelfish has been going for years, and I have used it for years. The translations given are absolutely awful. At best they are a first pass for a human translator, or give you a general idea of what the text is about. -- Dave Fawthrop From Bowerbird at aol.com Thu Oct 19 00:35:59 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 00:36:08 2006 Subject: [gutvol-d] meanwhile, i'm really excited! Message-ID: lee said: > I'll admit that Mr. Noring's suggestion of creating an authoring tool > that is word processor-like and WYSIWYGy, that would permit author's i take it you mean "authors" -- the plural -- not the possessive... > I'll admit that Mr. Noring's suggestion of creating an authoring tool > that is word processor-like and WYSIWYGy, that would permit author's > to create e-books in a user-friendly environment, that would > save documents in a powerful and useful format, and would > convert to multiple existing e-book formats at the click of a button, > is a very intriguing programming project. that's what we're doing here, lee! the first part -- a wordprocessor-like, wysiwyg-y authoring-tool is simple enough when the file-format is something as simple as .zml. further, z.m.l. is that "powerful and useful format" that you mention, precisely because it will do what you say you want to do, which is to create files across a variety of e-book formats. 
because once we morph the .zml file to .pdf and especially (x)html, it's a simple matter of routing that .html to the converter programs. (any e-book format that can't accept .html as input is a non-starter.) > On the other hand, learning your approach to converting ZML > to HTML is of virtually no interest to me at all. gee, lee, you're talking out of both sides of your mouth here. first you say "it's a very intriguing programming project", and then you say it is "of virtually no interest to me at all"... which is it lee? -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/fc228f12/attachment.html From schultzk at uni-trier.de Thu Oct 19 02:07:47 2006 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Thu Oct 19 02:07:53 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: Hi There, There are already such tools availible commercially!! It has been around for a long time TeX and LaTeX. Textures is a WYSIWYG system and authouring Tool. LaTeX can be easily converted to pdf, html, xml, docbook, etc. As Bowerbird mentioned in another thread why reinvent the wheel or try to. Just my two Euro cents worth! reagards Keith. P.S. LaTeX is mark-up, Has chapters, paragraphs, formulas, graphics, footnotes, layout control, indices, bibliographie, multi-language, right-left, left-right., ASCII, UniCode, pratically platform independent. There are freeware versions, but they are generally not WYSIWYG, thereby having at first a stiff learning curve. Keith. Am 19.10.2006 um 09:35 schrieb Bowerbird@aol.com: > lee said: > > I'll admit that Mr. Noring's suggestion of creating an > authoring tool > > that is word processor-like and WYSIWYGy, that would permit > author's > > i take it you mean "authors" -- the plural -- not the possessive... > > > I'll admit that Mr. 
Noring's suggestion of creating an > authoring tool > > that is word processor-like and WYSIWYGy, that would permit > author's > > to create e-books in a user-friendly environment, that would > > save documents in a powerful and useful format, and would > > convert to multiple existing e-book formats at the click of a > button, > > is a very intriguing programming project. > > that's what we're doing here, lee! > > the first part -- a wordprocessor-like, wysiwyg-y authoring-tool is > simple enough when the file-format is something as simple as .zml. > > further, z.m.l. is that "powerful and useful format" that you mention, > precisely because it will do what you say you want to do, which is to > create files across a variety of e-book formats. > > because once we morph the .zml file to .pdf and especially (x)html, > it's a simple matter of routing that .html to the converter programs. > (any e-book format that can't accept .html as input is a non-starter.) > > > > On the other hand, learning your approach to converting ZML > > to HTML is of virtually no interest to me at all. > > gee, lee, you're talking out of both sides of your mouth here. > > first you say "it's a very intriguing programming project", > and then you say it is "of virtually no interest to me at all"... > > which is it lee? > > -bowerbird > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/c2172252/attachment-0001.html From marcello at perathoner.de Thu Oct 19 04:45:08 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 04:45:12 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <556.9857302.32682d39@aol.com> References: <556.9857302.32682d39@aol.com> Message-ID: <45376544.5000808@perathoner.de> Bowerbird@aol.com wrote: > page 2 validates! woo-hoo! Wow! only 426 pages to go on the way to the first Bowerbird "uality" ebook. Then you can start fixing the 428 side-by-side view pages: http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001w.html (I don't know why there are 428 of them, I would have thought 214 were enough. You must know something I don't.) Then you can upgrade to from HTML 4.01 transitional to XHTML 1.0 strict (because OEB standard is based on XHTML). Then you can restructure your XHTML so it does not just "validate" but makes sense semantically. At the same time you can build your CSS stylesheets. Then you can add WAI accessibility features so screen readers actually know what they are reading. Then you'll be about half the way PGTEI is now. I guess, hoping that you'll stop bothering this list until you have fixed your processes, is out of the question? -- Marcello Perathoner webmaster@gutenberg.org From davedoty at hotmail.com Thu Oct 19 04:42:56 2006 From: davedoty at hotmail.com (Dave Doty) Date: Thu Oct 19 04:54:58 2006 Subject: [gutvol-d] meanwhile, i'm really excited! Message-ID: Okay, how did Bowerbird get around my killfile?
From marcello at perathoner.de Thu Oct 19 04:57:19 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 04:57:25 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <4537681F.8050801@perathoner.de> Bowerbird@aol.com wrote: > indeed, in fact, if memory serves me correctly, i left "that error" in > on purpose, precisely because it caused the page not to validate, > and i knew that would be upsetting to you technoid wieners here. Poor, poor, Bowerbird. We have misunderestimated you. It is uncanny how predictable you are. Every time somebody scores over you by attacking your "evidence" frontally, you go nonlinear for hours. > nah, the blind people will prefer the plain-text .zml version, > since there isn't any crap in there to gunk up screen-readers. And nothing to help screen readers ... -- Marcello Perathoner webmaster@gutenberg.org From grythumn at gmail.com Thu Oct 19 05:07:03 2006 From: grythumn at gmail.com (Robert Cicconetti) Date: Thu Oct 19 05:35:18 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: <15cfa2a50610190507u5776495fke06fc5a4c1d77ed3@mail.gmail.com> People keep replying to him. I've been tempted to make my filter match on body text too, but there have been a few occasions where the replies were interesting. R C On 10/19/06, Dave Doty wrote: > > > Okay, how did Bowerbird get around my killfile? -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/fee6a313/attachment.html From desrod at gnu-designs.com Thu Oct 19 07:36:22 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 07:35:13 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <45376544.5000808@perathoner.de> References: <556.9857302.32682d39@aol.com> <45376544.5000808@perathoner.de> Message-ID: > Then you can upgrade to from HTML 4.01 transitional to XHTML 1.0 > strict (because OEB standard is based on XHTML). Careful there... For the Windows users, MSIE doesn't support proper XHTML _at all_ (but it does for the degraded XHTML-as-HTML doctype that 99% of the people who think they're using XHTML properly are declaring it as). XHTML is properly served as "application/xhtml+xml", but most people end up sending it as "text/html", which doesn't really improve the situation. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From desrod at gnu-designs.com Thu Oct 19 07:38:31 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 07:37:13 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: > babelfish has been going for years, and I have used it for years. > The translations given are absolutely awful. At best they are a > first pass for a human translator, or give you a general idea of > what the text is about. translate.google.com is fairly accurate with many of the unpopular languages, and I've had good success with it over the last couple of years. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From Bowerbird at aol.com Thu Oct 19 10:10:15 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 10:10:26 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: <490.ba2c344.32690b77@aol.com> marcello said: > Then you can start fixing the 428 side-by-side view pages: boo-ya! 
the fun will run into 2009! > (I don't know why there are 428 of them, > I would have thought 214 were enough. > You must know something I don't.) i know that end-users want _simplicity_ in referencing them, which means they might want to reference each pagespread using its left-page reference _or_ its right-page reference... > Then you can upgrade to from HTML 4.01 transitional > to XHTML 1.0 strict (because OEB standard is based on XHTML). as soon as jon noring joins our little open-source project here, i'm sure he'll be happy as a pig in shit to tell us how to do that... speaking of which, i'm sure jon is _so_ proud of me now that i'm engaging in an open-source project. just what he always wanted! > Then you can restructure your XHTML so it does not just "validate" > but makes sense semantically. again, jon noring will be our go-to guy on that. > At the same time you can build your CSS stylesheets. ditto. if it's a three-letter acronym, jon's got it covered! > Then you can add WAI accessibility features so screen readers > actually know what they are reading. hey, if you've got the "expert" that everyone is "inviting", then it would be stupid not to make good use of him, eh? > Then you'll be about half the way PGTEI is now. half-way! woo-hoo! so what they say about open-source is true! the march toward excellence just seems to happen automagically! > I guess, hoping that you'll stop bothering this list until you have > fixed your processes, is out of the question? i "fixed" page 3 so it "validates". as soon as you report the "error" on page 4, i'll put that on my list to "repair" tomorrow. so much fun! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/e7378a18/attachment.html From Bowerbird at aol.com Thu Oct 19 10:28:37 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 10:28:46 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: <594.40e032fd.32690fc5@aol.com> marcello said: > Every time somebody scores over you > by attacking your "evidence" frontally, > you go nonlinear for hours. you really believe you've "scored over me" with that lame validation crap, don't you? well, keep humoring yourself, marcello... > you go nonlinear for hours. nonlinear? well, i _sleep_ on occasion, but... *** meanwhile, other trolls have joined in now, with nothing more to contribute than to tell the hundreds of people subscribed here that _they_ have me in their killfile. bully for you. but your attempt to get people to stop paying attention by flooding their e-mailboxes with nothingness is so transparent it's easy to reveal. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/e7a230d3/attachment.html From desrod at gnu-designs.com Thu Oct 19 10:28:47 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 10:30:32 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <490.ba2c344.32690b77@aol.com> References: <490.ba2c344.32690b77@aol.com> Message-ID: <1161278927.6501.10.camel@localhost.localdomain> > boo-ya! the fun will run into 2009! > as soon as jon noring joins our little open-source project here, > i'm sure he'll be happy as a pig in shit to tell us how to do that... > speaking of which, i'm sure jon is _so_ proud of me now that i'm > engaging in an open-source project. just what he always wanted! > again, jon noring will be our go-to guy on that. > ditto. if it's a three-letter acronym, jon's got it covered! 
> hey, if you've got the "expert" that everyone is "inviting", > then it would be stupid not to make good use of him, eh? > half-way! woo-hoo! so what they say about open-source is true! > the march toward excellence just seems to happen automagically! > i "fixed" page 3 so it "validates". as soon as you report the "error" > on page 4, i'll put that on my list to "repair" tomorrow. so much > fun! I have to wonder if anyone has done a serious psychological profile on our fellow Bowerbird here, based on the various replies he's made over the years. -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/d5613059/attachment.bin From Bowerbird at aol.com Thu Oct 19 10:44:20 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 10:44:48 2006 Subject: [gutvol-d] what it all boils down to Message-ID: <51a.593388a4.32691374@aol.com> i'm not sure what it was about my postings lately that has drawn the trolls out of the caves where they had been sequestered for some time, but i sure wish they'd go back. the silence was nice... at any rate, let's talk about what all this boils down to. when your e-book format is _complex_and_obtuse_, like .xml or .tei, it's difficult to program tools for it, and you need people with expertise to maintain it. that is the sad reality faced by the technoids here... when your file-format is _simple_, like my z.m.l., it is exceedingly easy to program tools for it, and even a large library of files can be maintained by an above-average 4th-grader. makes sense, right? it all boils down to costs and benefits, and their ratio. 
and it doesn't matter how much flak my detractors pitch at me, in the long run, it's _always_ going to boil down to costs and benefits, and their ratio... so beware the technoids who want you to buy their complex systems, and keep paying for them forever... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/8f253c28/attachment-0001.html From desrod at gnu-designs.com Thu Oct 19 10:57:14 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 10:58:25 2006 Subject: [gutvol-d] what it all boils down to In-Reply-To: <51a.593388a4.32691374@aol.com> References: <51a.593388a4.32691374@aol.com> Message-ID: <1161280634.6501.18.camel@localhost.localdomain> > when your e-book format is _complex_and_obtuse_, > like .xml or .tei, it's difficult to program tools for it, > and you need people with expertise to maintain it. > that is the sad reality faced by the technoids here... When your system is incompatible with standards and obtuse, you go reinvent your own format, and have to write tools from scratch to support it. It makes sense to make the format and the tools as simple as possible, to speed delivery. When you work with standards and simple formats like XML and XML-based derivitives, you can use the wealth of tools and other resources that others have written instead of reinventing your own wheel, simply because you refuse to acknowledge that others have done the work so you don't have to. > it all boils down to costs and benefits, and their ratio. I agree, which is why I personally use standards-compliant tools and methodologies, centered around a unified, agreed-upon format which can then be converted into any other format I wish. > and it doesn't matter how much flak my detractors > pitch at me, in the long run, it's _always_ going to > boil down to costs and benefits, and their ratio... 
Use what works for you, just don't presume that others have better ideas than your own or have done it in a different way which works for others. > so beware the technoids who want you to buy their > complex systems, and keep paying for them forever... And beware of propritary, unaccepted formats which exist in a vacuum without community or industry support, tools and resources to support them. -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/8859814c/attachment.bin From Bowerbird at aol.com Thu Oct 19 11:37:07 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 11:37:13 2006 Subject: [gutvol-d] what it all boils down to Message-ID: david said: > When your system is incompatible with standards and obtuse, > you go reinvent your own format, and have to write tools from scratch > to support it. people invent new approaches all the time. the best of those get turned into "standards". (at least sometimes anyway. ok, _occasionally_.) but nothing can turn back an idea whose time has come, and the time for the idea of _simplicity_ is now upon us... soon it will be as hard to sell _complexity_ as it is to sell a _c.d._, and let us all observe a moment of silence now for tower records. > When you work with standards and simple formats > like XML and XML-based derivitives, you can use > the wealth of tools and other resources that > others have written i would think that jon and lee would appreciate your pointers to "the wealth of tools and other resources that others have written", since they're undertaking a project to reinvent the wheel yet again. heck, i think we'd _all_ appreciate those pointers, david... 
> I agree, which is why I personally use standards-compliant tools > and methodologies, centered around a unified, agreed-upon format > which can then be converted into any other format I wish. that seems like a sensible position. and i assume that it means that when the _simple_yet_fully_effective_ format is the one that is the "unified, agreed-upon format", that you will use it. of course you will, you're a practical man. so will everyone. when costs are low and benefits are high, the decision is a no-brainer. in the meantime, someone has to create that simple-yet-fully-effective format. i'm one of those someone's, but i'm not the only one, not at all. markdown, textile, wiki-formatting, crossmark, there's a bunch of 'em, all growing out of the philosophy that people don't wanna do markup. > Use what works for you gee, thanks for giving me your permission, david! :+) > just don't presume that others have better ideas than your own ok, i won't presume that others have better ideas than my own. (wha?) but i will be open to the possibility, if you don't mind, because otherwise i would become too inflexible, and i _like_ flexibility. besides, there have been too many cases in the past where someone else had a better idea than mine. (so i adopted it.) > or have done it in a different way which works for others. one of us is confused here. am i supposed to _not_ "presume" this? because it seems pretty obvious to me that some people do things "in a different way" than i do, and i _assume_ they are doing that because it "works" for them -- or at least that they _think_ it does. (which doesn't mean they're necessarily correct in that judgment.) > And beware of propritary, unaccepted formats which > exist in a vacuum without community or industry support, > tools and resources to support them. i guess "proprietary" is such a dirty word to you that you can't even bring yourself to spell it correctly, eh? maybe even took it out of your spellcheck dictionary? 
at any rate, i think a quick perusal of the "11 rules" of z.m.l. will quash the all-too-silly notion that z.m.l. is "proprietary". > http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt but then again, maybe today's patent office _would_ give me a patent on sterling ideas like "4 blank lines before a header" or "indent lines of the poem however much you want them to be indented, but use at least one space of indentation so we know that it's a _block_ and that it shouldn't be re-wrapped"... because those are some really earth-shaking ideas, right? hello gallileo, the catholics will surely condemn me too... yet would it not be sweet if i could demand a patent royalty from anyone who ever used 4 blank lines in a row? awesome! i'd be rich! rich, i tell you! filthy bloody rich! radical! ;+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/3c8a1843/attachment.html From marcello at perathoner.de Thu Oct 19 12:01:10 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 12:01:14 2006 Subject: [gutvol-d] what it all boils down to In-Reply-To: References: Message-ID: <4537CB76.9040200@perathoner.de> Bowerbird@aol.com wrote: > i guess "proprietary" is such a dirty word to you that > you can't even bring yourself to spell it correctly, eh? That's the second spelling flame by you today. > and, of course, ad hominem arguments are always cheap shots. > (Bowerbird, 01 Jan 2004) Spelling flames are ad hominem arguments. Funny. Isn't it? > hello gallileo, the catholics will surely condemn me too... ^^^^^^^^ Even funnier, when the spelling flamers go on, they usually knock themselves out. 
-- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Oct 19 12:08:08 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 12:08:12 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <1161278927.6501.10.camel@localhost.localdomain> References: <490.ba2c344.32690b77@aol.com> <1161278927.6501.10.camel@localhost.localdomain> Message-ID: <4537CD18.3070703@perathoner.de> David A. Desrosiers wrote: > I have to wonder if anyone has done a serious psychological profile on > our fellow Bowerbird here, based on the various replies he's made over > the years. Here, here! Me! Me! http://www.gnutenberg.de/bowerbird/ -- Marcello Perathoner webmaster@gutenberg.org From cannona at fireantproductions.com Thu Oct 19 12:09:45 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Oct 19 12:12:08 2006 Subject: [gutvol-d] what it all boils down to References: <51a.593388a4.32691374@aol.com> Message-ID: <006801c6f3b2$79666600$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bowerbird wrote: > > when your file-format is _simple_, like my z.m.l., > it is exceedingly easy to program tools for it, and > even a large library of files can be maintained by > an above-average 4th-grader. makes sense, right? Of course, the files will look like they were maintained by an above average 4th grader, because most 9-10 year olds, even if they are above average, probably won't quite grasp the importance of accurately representing complex tables, formulas, or accessibility. Of course, that won't matter because the format doesn't support them anyway. > it all boils down to costs and benefits, and their ratio. Exactly. However, a very small cost over an even smaller bennifit gives you a high ratio, which, in this context is undesirable. 
I know you would argue that the bennifits of ZML are high, but I would submit that if a tool doesn't do the job that you need done, then it is next to worthless. > so beware the technoids who want you to buy their > complex systems, and keep paying for them forever... Good advice. I would only add, beware those who extole the virtues of their system while ignoring, glossing over, or minimizing the importance of its flaws. Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFN84JI7J99hVZuJcRAkVlAKDXGtTHFbKeNlYt8cJenhsEQnD4igCgkQAu WN+uJoEHfhRVLoid4Gf2Mc8= =XqVB -----END PGP SIGNATURE----- From Bowerbird at aol.com Thu Oct 19 12:31:02 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 12:31:09 2006 Subject: [gutvol-d] what it all boils down to Message-ID: marcello said: > Spelling flames are ad hominem arguments. no they aren't. if you "accused" someone of being a bad speller, that might be ad hominem. it would also be a pretty lame accusation, since i can't see how it would impact on the chain of their arguments... > Even funnier, when the spelling flamers go on, > they usually knock themselves out. 26,000+ hits on "gallileo" at google, so i went with it. of course, if i'd tried "galileo", i woulda got 26 _million_. live and learn. i also made a typo recently with "purse" for "pursue". i usually just mention spelling errors in passing, on the way to a bigger point, and usually only if they reveal something that i think is interesting, like lee stumbling over "authors" or david tripping up on "proprietary", funny things like that. but hey, i'll give you full credit for a spell-catch, marcello... :+) believe me, spelling means a whole helluva lot more to me than wc3 "validation", assuming my code works just fine... 
-bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/b2366f17/attachment.html From Bowerbird at aol.com Thu Oct 19 12:36:18 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 12:36:24 2006 Subject: [gutvol-d] what it all boils down to Message-ID: aaron said: > Of course, that won't matter because > the format doesn't support them anyway. aaron, i suggest you do something _constructive_, like make a wiki-page that lists all of the e-texts from the p.g. library that z.m.l. "cannot handle"... then instead of making these vague accusations, you can actually _point_to_ some solid _evidence_. that'll save you time, and be much more effective... it will also be a lot more _fair_, though that doesn't seem to be something that you care about very much, because i'll be able to point out where you're wrong... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/18cd4fb4/attachment.html From marcello at perathoner.de Thu Oct 19 13:13:22 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 13:13:26 2006 Subject: [gutvol-d] [Fwd: [webgroup] Database server upgrade notice] Message-ID: <4537DC62.7050706@perathoner.de> The PG catalog resides on this server. The wiki will not be affected. -------- Original Message -------- Subject: [webgroup] Database server upgrade notice Date: Thu, 19 Oct 2006 15:56:53 -0400 (EDT) From: Ken Chestnutt To: webgroup@lists.ibiblio.org CC: ibiblio-announce@lists.ibiblio.org Dear ibiblio users, We would like to thank you for hosting your content with us. We have been steadily growing over the past few months in both size and variety of content. It is because of this growth that we need to do some maintenance. We will be upgrading one of our database servers Sunday night. 
We are scheduling a three hour downtime, starting at 9:00 p.m. EDT on Sunday, October 22. This downtime will not affect everyone. It will only affect users who have a database on the server mysql2. If you do not have a database or if your database is on mysql.ibiblio.org, you will remain unaffected by this upgrade. If you have any questions, please send us email at "help@ibiblio.org". Thanks Ken Chestnutt ibiblio.org _______________________________________________ webgroup mailing list webgroup@lists.ibiblio.org http://lists.ibiblio.org/mailman/listinfo/webgroup -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Oct 19 13:26:22 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Oct 19 13:26:26 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: <594.40e032fd.32690fc5@aol.com> References: <594.40e032fd.32690fc5@aol.com> Message-ID: <4537DF6E.5070205@perathoner.de> Bowerbird@aol.com wrote: > marcello said: >> you go nonlinear for hours. > > nonlinear? well, i _sleep_ on occasion, but... nonlinear: adj. [scientific computation] 1. Behaving in an erratic and unpredictable fashion; unstable. When used to describe the behavior of a machine or program, it suggests that said machine or program is being forced to run far outside of design specifications. This behavior may be induced by unreasonable inputs, or may be triggered when a more mundane bug sends the computation far off from its expected course. 2. When describing the behavior of a person, suggests a tantrum or a flame. "When you talk to Bob, don't mention the drug problem or he'll go nonlinear for hours." In this context, go nonlinear connotes "blow up out of proportion" (proportion connotes linearity).
---- http://www.catb.org/~esr/jargon/html/N/nonlinear.html -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Thu Oct 19 13:39:27 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 13:39:46 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: marcello said: > This behavior may be induced by unreasonable inputs well, there you have it! > When describing the behavior of a person, > suggests a tantrum or a flame. i'm confident the hundreds of lurkers here can tell -- without any difficulties at all -- who is being rational, solid, and thoughtful, and who is not. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/1203ee05/attachment.html From Bowerbird at aol.com Thu Oct 19 13:51:30 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 13:51:41 2006 Subject: [gutvol-d] leeward -- a first-draft open-source pass at "babelfish" Message-ID: i said: > the first part -- a wordprocessor-like, wysiwyg-y authoring-tool is > simple enough when the file-format is something as simple as .zml. i know it can be infuriating when someone says "oh that's simple", and hey, well maybe it isn't easy for you, for one reason or another. and i certainly wouldn't want to be infuriating... so lee, what i've done is to create "leeward", a nice little text-editor that is a first-draft pass on the specifications you laid out for us... no, this doesn't mean that i'm gonna start _programming_ on our little open-source effort here, that's _your_ job, i'm the _designer_, but i figure it can't hurt to do just a little bit to help you get started. besides, i didn't even write this program, i just took the code for a sample app that realbasic distributes freely with their compiler. 
however, i did change the label on the "italics" toggle from "i" to "e" -- for "emphasis", of course -- and likewise the label on the "bold" toggle from "b" to "s" (for "strong") to reflect the x.m.l. philosophy. i thought you'd appreciate the sensitivity of that nice little touch... (since i really don't know how to _show_ "emphasis" other than by using italics, and likewise bold for "strong", i just left that styling in. i hope that's ok. think of it as a headstart on the c.s.s.) anyway, lee, i compiled versions for o.s.x., windows, and linux, and i would be happy to e-mail you whichever ones you'd like to see... i'll even upload them to the web if anyone else wants to see them... to get the source-code, you can just download it from realbasic. > http://www.realsoftware.com it's the "text-editor" sample code. let me know if you can't find it. yes, lee, i know you'll probably use some other language/compiler. that's fine. because this is just a little nudge to help you get going, to show you how simple it can be. and i'm sure that whatever tool you choose to use to do your programming, you'll be able to find some simple text-editor code to serve as the springboard for you... anyway, lee, despite your seeming reluctance, i know that eventually you will come around to our little project. after all, it's open-source! so how can you resist? -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/1ebd40f4/attachment-0001.html From brett at dimetrodon.demon.co.uk Thu Oct 19 14:23:51 2006 From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar) Date: Thu Oct 19 14:25:06 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: Bowerbird@aol.com writes >marcello said: >>?? This behavior may be induced by unreasonable inputs > >well, there you have it! > > >>?? When describing the behavior of a person, >>?? 
suggests a tantrum or a flame. > >i'm confident the hundreds of lurkers here >can tell -- without any difficulties at all -- >who is being rational, solid, and thoughtful, >and who is not. As a lurker I would like to say I have developed the overwhelming impression that you are a pointless posturing idiot who will never develop a single piece of useful software or indeed ever contribute anything of value. On zml as far as I can see it is a feature deficient mark up with only one, largely pointless, advantage over a very basic xml, namely the source code looks nicer when viewed in notepad or similar, at the cost of making the source code a lot harder to edit. It has a very limited feature set and it is very hard to extend that set if it turns out you need additional features. It also tends to mean that it is easy for a typo to result in code that validates but is incorrect, e.g. three carriage returns rather than four in zml is valid but incorrect while accidentally replacing "

" with "

" is much harder -- Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm Livejournal http://brett-dunbar.livejournal.com/ Brett Paul Dunbar To email me, use reply-to address From Bowerbird at aol.com Thu Oct 19 14:53:35 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 14:53:46 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: brett said: > I have developed the overwhelming impression that > you are a pointless posturing idiot who will never > develop a single piece of useful software or > indeed ever contribute anything of value. if you have specific feedback about my software, i'd love to hear it. otherwise, i get the impression that you haven't even looked at it. am i incorrect? > On zml as far as I can see it is a feature deficient mark up again, if you mentioned features in which it is "deficient", your rant would have less emotionality and more power... > with only one, largely pointless, advantage over > a very basic xml, namely the source code looks nicer > when viewed in notepad or similar well, it _does_ look nicer in a text-editor, thanks for noticing. but the real difference is that it is endowed with a number of e-book capabilities when loaded into a zml-viewer-program. even if you can get those same capabilities with a more complex format, the question is why you'd pay the extra cost of complexity for the same set of benefits. i can't think of _any_ good reason... > at the cost of making the source code a lot harder to edit. i don't see how you can even suggest that editing .zml would be "a lot harder" than editing .xml. that's ridiculous on the face of it. > It has a very limited feature set what are the limitations? > and it is very hard to extend that set > if it turns out you need additional features. no, i've found it's simple to extend the idea to get new features. 
that's why i keep asking for suggestions, so if anyone comes up with something that i think needs to be added, i can do it now... > It also tends to mean that it is easy for a typo to result in > code that validates but is incorrect first, the concept of "validation" doesn't apply to z.m.l. sorry. z.m.l. works the way it works. if you don't get what you want, then you need to rework your file until you get what you want. fortunately, it's easy to "rework your file" to get what you want, especially since there is such a small number of simple "rules"... > three carriage returns rather than four in zml is valid but incorrect 3 empty lines won't give you a _header_, so if that's what you want, yes, you'll need to add another empty line. but since the wysiwyg display will show you -- clearly -- that you do not have a header, it'll be obvious to you that you need to put in more empty lines... furthermore, the "contents" listbox will let you bump up (or down) the _level_ of each header, in the style of "outliner" applications, and display them _as_ an outline, so i think you need a better example... besides, for the people who need to see a more _visible_ form of header-markup, i _might_ also support _atx_ header markup: +++ this would be a level-three header. ++ this would be a level-two header. + this would be a level-one header. *** i'm also considering the se-text style of headers: this is a level-one header ========================== and this is a level-two header ------------------------------ *** i'm still undecided on those, since it seems to me that the number-of-blank-lines method is easy enough to grok, and it represents the "invisible" ideal of zen markup best, but when i do decide, i promise to announce it here _first_. *** so anyway, brett, thanks for your feedback. but more precision, next time, if you please. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/f3e4ddad/attachment.html From cannona at fireantproductions.com Thu Oct 19 15:46:12 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Oct 19 15:47:31 2006 Subject: [gutvol-d] what it all boils down to References: Message-ID: <026a01c6f3d0$8f1a4b10$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bowerbird Wrote: > aaron, i suggest you do something _constructive_, > like make a wiki-page that lists all of the e-texts > from the p.g. library that z.m.l. "cannot handle"... > > then instead of making these vague accusations, > you can actually _point_to_ some solid _evidence_. > that'll save you time, and be much more effective... Do things that you've said count for evidence in your opinion? "since google will be scanning 10-40 million books, there are plenty of non-math texts for me to work on before i do the math books..." "> Sure we do. We use TeX (or pseudo-TeX fragments). and that's why that's what i'll probably do as well, when the time comes that i feel that it's necessary, because that's my modus operandi, to utilize the existing conventions, to best leverage current work." "but for now, i'm not at all worried about this 'problem'." "ultimately, for music, i will probably do exactly what d.p. has done, and use lilypond or finale, and route that file to either an external player or one that i have embedded in my viewer-app. unlike music-markup-language, lilypond shares my core philosophy of simplicity and elegance... "i'll follow the same approach for math equations, routing them to an equation editor that is either (a) an external app, or (b) embedded in my viewer. i'd guess it will probably be tex-based rather than math-markup-language, as tex is widely preferred, and expressible in utf-8. 
(math-markup-language is also expressible in utf-8, but it's also got all that angle-bracket gunk in it, which i'm badly allergic to.)" "meanwhile, for the 99.7% of the project gutenberg library which currently has no need _at_all_ (let alone any _compelling_ need) for math equations, i don't have to worry about them, thank you." " my own system will eventually be able to handle quite complex tables, when i find the need to develop it that far." "there aren't a whole lot of tables in the e-texts -- we're talking literature, not spreadsheets -- but your system should handle tables anyway; not really big and hairy ones, just simple ones" "my own system will eventually be able to handle quite complex tables, when i find the need to develop it that far. and if you'd like some proof, then hand me a list of 100 e-texts that use tables, and i will tackle them first when the time for "attacking tables" comes up big on my agenda... (and leave out the spalding baseball guides, i already know about them.)" So, either you are really confused, or your format doesn't support complex tables and mathematics. I am curious though; if you already know that your format doesn't support these things, why do you need us to find the ebooks that contain them for you? Is it that you are unwilling to do the work your self? Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFOACCI7J99hVZuJcRAt9tAKCYYAxarzSpmMOPFpBt1VTHNGl+JACg9mNR CY3Yy+m3G8udKoBy623InT8= =V2un -----END PGP SIGNATURE----- From brett at dimetrodon.demon.co.uk Thu Oct 19 15:53:33 2006 From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar) Date: Thu Oct 19 15:54:49 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: Bowerbird@aol.com writes >brett said: >>?? 
I have developed the overwhelming impression that >>?? you are a pointless posturing idiot who will never >>?? develop a single piece of useful software or >>?? indeed ever contribute anything of value. > >if you have specific feedback about my software, >i'd love to hear it.? otherwise, i get the impression >that you haven't even looked at it.? am i incorrect? > > >>?? On zml as far as I can see it is a feature deficient mark up > >again, if you mentioned features in which it is "deficient", >your rant would have less emotionality and more power... Mathematical formulae, exotic typography, support for features impossible in print. > >>?? with only one, largely pointless, advantage over >>?? a very basic xml, namely the source code looks nicer >>?? when viewed in notepad or similar > >well, it _does_ look nicer in a text-editor, thanks for noticing. > >but the real difference is that it is endowed with a number of >e-book capabilities when loaded into a zml-viewer-program. All of which can be done in a basic xml just as easily. As a format that has the major advantage of being designed to allow the addition of extra features to the format if they become necessary. >even if you can get those same capabilities with a more complex >format, the question is why you'd pay the extra cost of complexity >for the same set of benefits.? i can't think of _any_ good reason... Future proofing, you can simply not use a feature that is present that you don't need, while you have real problems if you need a feature that isn't there. >>?? at the cost of making the source code a lot harder to edit. > >i don't see how you can even suggest that editing .zml would be >"a lot harder" than editing .xml.? that's ridiculous on the face of it. The mark-up is largely invisible, being white space, and therefore hard to see, some of it relies on counting carriage returns, which is hard to do by eye. While adding a few angle brackets to a piece of text is pretty simple. > >>?? 
It has a very limited feature set >what are the limitations? It only represents a fairly limited range of formatting, for example there is no clear way of representing tables complex formulae or various forms of exotic formatting, e.g. in _Jingo_ Terry Pratchett uses an Arabic-looking font to represent Klatchian he has one Captain Carrot use some words mostly in that font but with the h in the normal font, as Carrot is mispronouncing the h. I very much doubt zml could handle that, nonetheless it is required for that book. > >>?? and it is very hard to extend that set >>?? if it turns out you need additional features. > >no, i've found it's simple to extend the idea to get new features. >that's why i keep asking for suggestions, so if anyone comes up >with something that i think needs to be added, i can do it now... That isn't the point, the point is if a feature turns out to be needed at some point in the future, and has been omitted from the original spec, xml can be extended to include it in a straightforward manner zml can't. > >>?? It also tends to mean that it is easy for a typo to result in >>?? code that validates but is incorrect > >first, the concept of "validation" doesn't apply to z.m.l.? sorry. The point of a validator is to attempt to catch things like a part header being accidentally shown as a chapter header automatically, this makes it easier to find and fix errors. >z.m.l. works the way it works.? if you don't get what you want, >then you need to rework your file until you get what you want. > >fortunately, it's easy to "rework your file" to get what you want, >especially since there is such a small number of simple "rules"... > > >>?? three carriage returns rather than four in zml is valid but >incorrect > >3 empty lines won't give you a _header_, so if that's what you want, >yes, you'll need to add another empty line.? 
but since the wysiwyg >display will show you -- clearly -- that you do not have a header, >it'll be obvious to you that you need to put in more empty lines... If you are using a special application to edit it anyway there is no reason to obscure the mark-up in the source code. With xml you can also edit in something like notepad rather more easily. That is also requiring you to spot manually an error that an xml validator could catch automatically. > >furthermore, the "contents" listbox will let you bump up (or down) the >_level_ of each header, in the style of "outliner" applications, and >display them _as_ an outline, so i think you need a better example... Tom's ebookreader has long done something like that. > >besides, for the people who need to see a more _visible_ form >of header-markup, i _might_ also support _atx_ header markup: > >+++ this would be a level-three header. > >++ this would be a level-two header. >+ this would be a level-one header. I don't really see this as having any detectable advantage over the xml style angle brackets. They have the disadvantage that a typo will still be valid but incorrect, while a similar typo is liable to produce invalid xml. The angle bracket system also allows the easy addition of further format types at a later date. >*** >i'm also considering the se-text style of headers: >this is a level-one header >========================== > >and this is a level-two header >------------------------------ > >*** > >i'm still undecided on those, since it seems to me that >the number-of-blank-lines method is easy enough to grok, >and it represents the "invisible" ideal of zen markup best, >but when i do decide, i promise to announce it here _first_. Blank lines are easy to understand, they are hard to edit, which is the problem. 
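The disagreement above about counting empty lines by eye can be made concrete with a small checker. This is only a sketch under stated assumptions: it assumes, going by the thread, that four or more empty lines introduce a header and that a run of exactly three is a likely typo. The sub name blank_runs and the sample text are hypothetical, not part of any z.m.l. tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical checker: walk the text line by line, count each run of
# empty lines, and report the run length next to the line that follows.
# ASSUMPTION: four or more empty lines introduce a header; a run of
# exactly three is flagged as a possible typo for four.
sub blank_runs {
    my ($text) = @_;
    my @report;
    my $run = 0;
    for my $line (split /\r?\n/, $text, -1) {
        if ($line =~ /^\s*$/) {      # empty or whitespace-only line
            $run++;
            next;
        }
        if ($run >= 3) {
            my $kind = $run >= 4 ? 'header' : 'suspect';
            push @report, "$kind ($run empty lines): $line";
        }
        $run = 0;
    }
    return @report;
}

# Made-up sample: "chapter one" has four empty lines before it,
# "chapter two" only three.
my $sample = "intro\n\n\n\n\nchapter one\n\ntext\n\n\n\nchapter two\n\ntext\n";
print "$_\n" for blank_runs($sample);
```

On the sample, the first heading is reported as a header and the second as suspect, which is the kind of automatic check a validator would otherwise provide.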
-- Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm Livejournal http://brett-dunbar.livejournal.com/ Brett Paul Dunbar To email me, use reply-to address From Bowerbird at aol.com Thu Oct 19 16:12:48 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 16:12:56 2006 Subject: [gutvol-d] what it all boils down to Message-ID: <42c.8147b3d.32696070@aol.com> aaron said: > So, either you are really confused, > or your format doesn't support complex tables and mathematics. i'm not confused in the slightest, aaron. z.m.l. supports math equations via graphics. z.m.l. supports "complex tables" by advising that they be broken down to "simple tables" instead... this kind of support is sufficient for the time being. however, if you were to show me some actual e-texts that are currently in the p.g. library for which you would like to see more extensive support, i'd be happy to look at those e-texts and tell you what i would consider doing. if, on the other hand, you just want to make some vague claims about what z.m.l. is "incapable" of handling, then i won't bother to even respond to your e-mails any more, for obvious reasons... > I am curious though; if you already know that your format > doesn't support these things, why do you need us to find > the ebooks that contain them for you? i am pretty familiar with what's in the library... i think the methodology i have in place already is more than sufficient for handling what's there. but i'm more than willing to entertain the possibility that there are e-texts that it would be good for me to take a close hard look at, so if you have any pointers in that regard, i would be most happy to receive them. otherwise, i'll assume you can't find anything that would be too difficult to handle, and i'll just continue on with my present plans for my survey of the total library as scheduled. > Is it that you are unwilling to do the work your self? no. 
as i said, i'll get around to the whole library eventually. in the meantime, i'm just calling your bluff. so you can either show your cards, or keep on stalling. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/eaac7e20/attachment-0001.html From Bowerbird at aol.com Thu Oct 19 16:27:41 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 16:27:49 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: brett said: > Mathematical formulae graphics. [heavy sigh] > exotic typography graphics. > support for features impossible in print. such as? > All of which can be done in a basic xml just as easily. "just as easily" is a decision that end-users will make. > Future proofing yeah, right. we've had, what, 4 versions of .html in the last decade, and even that is now outdated? x.m.l. advocates seem to have a _very_ short memory. that's not what i see as being good for "future proofing". > some of it relies on counting carriage returns, > which is hard to do by eye. that's why you have the computer do it for you, and show the results in a starkly obvious way... > I very much doubt zml could handle that, > nonetheless it is required for that book. let me know when that book hits the p.g. library, because i'm quite sure i'll be able to handle it then. or alternately, show me the x.m.l. "solution" for it. > That isn't the point, the point is > if a feature turns out to be needed at some point in the future, > and has been omitted from the original spec, > xml can be extended to include it in a straightforward manner > zml can't. of course z.m.l. can add new features when needed. > header being accidentally shown as a chapter header automatically, > this makes it easier to find and fix errors. i build that capability into the authoring-tool, where it's most useful. 
> If you are using a special application to edit it anyway > there is no reason to obscure the mark-up in the source code. except because obfuscatory mark-up is ugly, that's all. not to mention obfuscatory. and that it gets in the way. > With xml you can also edit in something like notepad rather more easily. here again you say something that's ridiculous on its face. editing x.m.l. in notepad. yeah, right. that's the solution! i'm perplexed why lee and jon noring never thought of that! > Tom's ebookreader > has long done something like that. very few things in z.m.l. are unprecedented. that's the point of it. > I don't really see this as having any detectable advantage over the xml then i suggest you stick with x.m.l., brett. believe me, that's o.k. with me! > Blank lines are easy to understand, they are hard to edit, > which is the problem. i don't know about _your_ keyboard, brett, but mine has this big key that is clearly labeled "return" in a prominent place... and there's another one -- labeled "enter" -- in the corner... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/431399a4/attachment.html From desrod at gnu-designs.com Thu Oct 19 16:35:56 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 16:36:36 2006 Subject: [gutvol-d] leeward -- a first-draft open-source pass at "babelfish" In-Reply-To: References: Message-ID: <1161300956.6501.24.camel@localhost.localdomain> On Thu, 2006-10-19 at 16:51 -0400, Bowerbird@aol.com wrote: > anyway, lee, despite your seeming reluctance, i know that eventually > you will come around to our little project. after all, it's > open-source! so how can you resist? If its truly "open source", where is the code that YOU changed? Pointing back to the upstream source is not sufficient to comply with most OSI approved licenses. -- David A. 
Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/9dbbcc90/attachment.bin From desrod at gnu-designs.com Thu Oct 19 16:36:28 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Thu Oct 19 16:37:36 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <1161300988.6501.26.camel@localhost.localdomain> On Thu, 2006-10-19 at 16:39 -0400, Bowerbird@aol.com wrote: > i'm confident the hundreds of lurkers here > can tell -- without any difficulties at all -- > who is being rational, solid, and thoughtful, > and who is not. Of that, I have no doubt. ;) -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/324b8063/attachment.bin From cannona at fireantproductions.com Thu Oct 19 16:42:46 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Oct 19 16:42:48 2006 Subject: [gutvol-d] what it all boils down to References: <42c.8147b3d.32696070@aol.com> Message-ID: <006501c6f3d8$493b5af0$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Bowerbird wrote: > i'm not confused in the slightest, aaron. > > z.m.l. supports math equations via graphics. > > z.m.l. supports "complex tables" by advising that > they be broken down to "simple tables" instead... > > this kind of support is sufficient for the time being. 
> > however, if you were to show me some actual e-texts > that are currently in the p.g. library for which you would > like to see more extensive support, i'd be happy to look > at those e-texts and tell you what i would consider doing. I stand corrected. Under your definition of "support", it appears that you can support any book in the PG library because after all, any book with complex formatting or using special notation can simply be displayed via graphics. If the pages are too wide to fit in a single image, that's ok, because ZML "supports" them by suggesting that they be broken down into simpler images. So, as a result of our new definition of support, I am forced to state unequivocally that ZML can support anything in the PG library, but it does so very very poorly. Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFOA15I7J99hVZuJcRAnUpAKC9jO7kk26d/WqVWAVLl2YIvrLJJgCgl8BO NI4wOvZL7Bq598J4ut7vnwc= =txfq -----END PGP SIGNATURE----- From Bowerbird at aol.com Thu Oct 19 17:41:18 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 17:41:25 2006 Subject: [gutvol-d] what it all boils down to Message-ID: aaron said: > but it does so very very poorly. and -- again -- if you point me to the books where it does "very very poorly", maybe then i'll do something to "improve" its performance. so are you going to just throw in your cards, or are you gonna show us what's in your hand? because so far you haven't proven jack... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/3cf3a28f/attachment.html From Bowerbird at aol.com Thu Oct 19 17:45:36 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 17:45:42 2006 Subject: [gutvol-d] leeward -- a first-draft open-source pass at "babelfish" Message-ID: david- i changed the labels on those two buttons. that's it. if people are interested in this text-editor program, i'll rewrite it from scratch, clean-room style, so its lineage will be spot-free. and you can be in charge of making sure it "complies" with all of the legalities. but since there are few (if any) realbasic programmers here on this list, i'll assume that won't be necessary... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/bc5e2233/attachment.html From Bowerbird at aol.com Thu Oct 19 17:48:57 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Oct 19 17:49:09 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: david said: > Of that, I have no doubt. ;) at last we have an island of agreement! :+) of course, as it should be clear to everyone, the issues won't be decided on this listserve. they'll be decided by users out in the real world, making decisions based on costs and benefits... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061019/bfdb7342/attachment.html From cannona at fireantproductions.com Thu Oct 19 18:41:32 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Oct 19 18:41:38 2006 Subject: [gutvol-d] what it all boils down to References: Message-ID: <00ed01c6f3e8$e2f0c530$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Perhaps an analogy will help you understand: Person a: I just designed this great new shark bite suit that I think is just swell. 
Person b: Oh really? Person a: Yep. The beauty of the design is that its so simple, anyone can build it, even an above average 4th grader. Once people realize how utterly cool and effective my design is, everyone will want one. Person B: Can I have one so I can take a look? Person a: Sorry, no. But I will tell you how to build one. There are just thirteen simple rules. Person b, after reviewing the rules: I can think of several situations where this suit won't be able to handle a shark attack. Person a: that's not true! My suit offers shark attack protection that is strong, inclusive, and exhaustive. I refuse to acknowledge any problems unless you give me an example. Person b: Well, for one thing it's made of plastic wrap and rubber bands, so I think the teeth of the shark are going to just cut right through it when he tries to bite you. Person a: You're wrong. My shark suit supports shark bites by advising that the user swim really really fast. Person b: That's ridiculous. Your supposed support for shark bites is no support at all. If it is, it's very very poor support. Person a: Show me where it has offered very very poor support for shark bites and maybe then i'll do something to "improve" its performance. so are you going to just throw in your cards, or are you gonna show us your wounds? Person b: If you can't intuitively understand how plastic wrap and swimming faster is not sufficient protection against shark teeth, then my showing you an example of it not working isn't going to help you. Get it? If you can't intuitively understand that breaking tables into smaller tables or displaying equations as graphics is really no support at all, then my showing you an example of it not working isn't going to help you. Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) 
- ----- Original Message ----- From: To: ; Sent: Thursday, October 19, 2006 7:41 PM Subject: re: [gutvol-d] what it all boils down to > aaron said: >> but it does so very very poorly. > > and -- again -- if you point me to the books > where it does "very very poorly", maybe then > i'll do something to "improve" its performance. > > so are you going to just throw in your cards, > or are you gonna show us what's in your hand? > > because so far you haven't proven jack... > > -bowerbird > - -------------------------------------------------------------------------------- > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFOClTI7J99hVZuJcRAjaNAKC4GSyWe6WuYDswhh0M1HlA8o2JMwCfZc8z h2mrB5qg9Z31vQElAFJuOq8= =m4xx -----END PGP SIGNATURE----- From Bowerbird at aol.com Fri Oct 20 00:54:17 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 00:54:26 2006 Subject: [gutvol-d] what it all boils down to Message-ID: when i "call" you, you're supposed to show me your cards, not tell me a story about a shark... i think you are now _firmly_ on the record that z.m.l. won't work... thanks for putting yourself _firmly_ on the record, aaron. time will tell... yes, time will tell... as it is, you exhausted my patience -- which is a difficult thing to do! -- so i won't likely be responding to you again any time soon, just so you know. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/f69ef366/attachment.html From Bowerbird at aol.com Fri Oct 20 02:03:35 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 02:03:46 2006 Subject: [gutvol-d] here's the perl code for babelfish assignment 01 Message-ID: <4bf.2aa73da.3269eae7@aol.com> here's the perl version for babelfish assignment 01, which was to read a file in and spit it to a webpage with [pre], a simple task accomplished in 11 lines... next time we'll learn about the "split" command, and break out the text for each individual page... and if someone would tell me how to do a "split" on a sequence of multiple line-endings, that'd be great. i assumed it would be something like this: > @thesections=split('\n\n\n\n\n',$thebook); but that doesn't appear to be working for me. so any help for this beginning perl script-kiddie would be greatly appreciated, thank you much, isn't open-source swell, all praise collaboration. -bowerbird ---------------------------------------------------------------------- #!/usr/bin/perl $filename="/home2/yoursiteinfo/public_html/myant/myant.zml"; open (inf,"$filename") or print "that file was not available...

\n";
read (inf,$thebook,2222222); close inf;
print "content-type: text/html\n\n";
print '';
print '';
print ' ';
print "my antonia!";
print "

";
print $thebook;
From schultzk at uni-trier.de  Fri Oct 20 02:15:51 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri Oct 20 02:15:57 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: 

Hi There,

	Due to its nature, multi-line parsing or splitting is not quite that easy
	in perl. If you are starting out with perl, I would suggest getting the book
	"Perl for Dummies". It is a great introduction with plenty of examples. You will
	find your answer there!!

	split is nice. But you want to be doing parsing, which is an art in its own
	right. I am sure you are up to it.

	Just for the fun of it: your script is incomplete.

		regards
			Keith.
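
For what it's worth, here is a minimal sketch of the split being asked about. Note that split treats its first argument as a pattern even when it is written as a single-quoted string, so split('\n\n\n\n\n', ...) does match five newlines; one plausible reason for it "not working" (an assumption, nothing in the thread confirms it) is that the file has CRLF line endings, so the newlines are never adjacent. The sub name split_sections, the sample text, and the "five or more" rule are illustrative choices, not anything from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a book into sections separated by five or more consecutive
# line endings (the count used in the original split call).
sub split_sections {
    my ($book) = @_;
    # Normalize CRLF (DOS) and bare CR (old Mac) line endings to LF first;
    # a stray \r between the \n characters is enough to defeat the split.
    $book =~ s/\r\n?/\n/g;
    return split /\n{5,}/, $book;
}

# Made-up sample with DOS line endings, standing in for the real file.
my $sample = "title page\r\n\r\n\r\n\r\n\r\n"
           . "chapter one\r\ntext\r\n\r\n\r\n\r\n\r\n"
           . "chapter two\r\n";

my @sections = split_sections($sample);
print scalar(@sections), " sections\n";
print "[$_]\n" for @sections;
```

Normalizing the line endings before splitting keeps the pattern itself trivial, which seems in the spirit of the "simple format, simple program" approach described in the thread.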

Am 20.10.2006 um 11:03 schrieb Bowerbird@aol.com:

> here's the perl version for babelfish assignment 01,
> which was to read a file in and spit it to a webpage
> with [pre], a simple task accomplished in 11 lines...
>
> next time we'll learn about the "split" command,
> and break out the text for each individual page...
>
> and if someone would tell me how to do a "split" on
> a sequence of multiple line-endings, that'd be great.
>
> i assumed it would be something like this:
> >   @thesections=split('\n\n\n\n\n',$thebook);
>
> but that doesn't appear to be working for me.
> so any help for this beginning perl script-kiddie
> would be greatly appreciated, thank you much,
> isn't open-source swell, all praise collaboration.
>
> -bowerbird
> ----------------------------------------------------------------------
>
> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...

\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print ' en">';
> print '';
> print '';
> print "my antonia!";
> print "

";
> print $thebook;
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
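
Keith does not say what is missing from the script. One reading is that it carries on after a failed open and never closes the tags it starts. The sketch below is written under that assumption: the sub names escape_html and render_page are hypothetical, the markup is a guess (the archive stripped the original tags), and the file name is taken from the command line instead of being hardcoded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in HTML so the book text
# cannot be mistaken for markup inside <pre>.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Build the whole response: the CGI header, a blank line, then a
# complete HTML document with every opened tag closed again.
sub render_page {
    my ($title, $text) = @_;
    return "Content-Type: text/html\n\n"
         . "<html>\n<head><title>" . escape_html($title) . "</title></head>\n"
         . "<body>\n<pre>\n" . escape_html($text) . "</pre>\n</body>\n</html>\n";
}

# Hypothetical usage: only runs when a path is given as an argument.
if (@ARGV) {
    my $filename = $ARGV[0];
    open(my $in, '<', $filename) or die "cannot open $filename: $!\n";
    local $/;                 # slurp mode: read the whole file at once
    my $book = <$in>;
    close $in;
    print render_page('my antonia!', $book);
}
```

Run as, say, `perl page.pl myant.zml` (both names are placeholders). Stopping with die when the file cannot be opened avoids sending a half-built page.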

From marcello at perathoner.de  Fri Oct 20 03:10:16 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 03:10:20 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538A088.2080307@perathoner.de>

Bowerbird@aol.com wrote:

> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...

\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print '';
> print '';
> print '
> ';
> print "my antonia!";
> print "

";
> print $thebook;

BUAHAHAHAHAHA !

Ever considered upgrading your 4th grader ?

Or, at least, don't let him/her write your code.




-- 
Marcello Perathoner
webmaster@gutenberg.org

From brett at dimetrodon.demon.co.uk  Fri Oct 20 03:44:32 2006
From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar)
Date: Fri Oct 20 03:46:43 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: 
References: 
Message-ID: 

Bowerbird@aol.com writes
>brett said:
>> Mathematical formulae
>
>graphics. [heavy sigh]
>
A very, very stupid approach; xml has support for various methods of 
doing the job properly.

>> exotic typography
>graphics.

A very, very stupid approach; xml has support for various methods of 
doing the job properly, and additional support can be added.

>
>> support for features impossible in print.
>such as?

Interactive tables, with sorting, functioning calculations, currency 
conversions, dynamically altering perspective on 3D graphs, clicking on 
a diagram to take you to the section on that component &c.

>
>> All of which can be done in a basic xml just as easily.
>
>"just as easily" is a decision that end-users will make.
>
>
>> Future proofing
>
>yeah, right. we've had, what, 4 versions of .html
>in the last decade, and even that is now outdated?

Each version has been a superset of the previous one, stuff has been 
added over time as the fundamental design of the format makes this 
fairly easy.

>x.m.l. advocates seem to have a _very_ short memory.
>that's not what i see as being good for "future proofing".
>

That is what makes the format future proof. If a feature turns out to be 
needed then it can be added later. As has happened several times with 
html, people came up with ideas the earlier designers hadn't thought of.

You seem to have the arrogance to believe that you can think of every 
needed feature at the outset. xml is designed to allow the 
straightforward addition of new features that never occurred to the 
original designers.

>> some of it relies on counting carriage returns,
>> which is hard to do by eye.
>
>that's why you have the computer do it for you,
>and show the results in a starkly obvious way...
>
>

If you aren't doing it by hand what possible advantage does using an 
editor producing zml have over one producing basic xml?

Why is four carriage returns easier than 

?

>> I very much doubt zml could handle that,
>> nonetheless it is required for that book.
>
>let me know when that book hits the p.g. library,
>because i'm quite sure i'll be able to handle it then.
>

I want to know whether you plan to handle having a single letter in a
word in a different font than the rest of the word. xml can; I am
certain that zml cannot.

>or alternately, show me the x.m.l. "solution" for it.

Something like "watchman".

>
>> That isn't the point, the point is
>> if a feature turns out to be needed at some point in the future,
>> and has been omitted from the original spec,
>> xml can be extended to include it in a straightforward manner
>> zml can't.
>
>of course z.m.l. can add new features when needed.
>
>

There is no way of indicating that the following is mark-up and then
allowing an arbitrary string. In xml anything in angle brackets is
markup; this makes adding new types of mark-up simple.

>> header being accidentally shown as a chapter header automatically,
>> this makes it easier to find and fix errors.
>
>i build that capability into the authoring-tool, where it's most
>useful.
>
>
>> If you are using a special application to edit it anyway
>> there is no reason to obscure the mark-up in the source code.
>
>except because obfuscatory mark-up is ugly, that's all.
>not to mention obfuscatory. and that it gets in the way.

If properly written it is not obfuscatory. The mark-up in simple cases
is obvious, as it is simply contained within angle brackets, which
clearly distinguishes the mark-up from the text. The advantage of xml
is that it can also do the complicated stuff that zml cannot; that
looks complicated because it is complicated.

>> With xml you can also edit in something like notepad rather more
>easily.
>
>here again you say something that's ridiculous on its face.
>
>editing x.m.l. in notepad. yeah, right. that's the solution!
>i'm perplexed why lee and jon noring never thought of that!

If editing simple texts, i.e. the kind that zml can handle, there
aren't actually a lot of tags to keep track of: italics, bold,
underline, a few layers of headers, pictures and footnotes, all of
which can be dealt with by very basic xml.

>> Tom's ebookreader
>
>> has long done something like that.
>
>very few things in z.m.l. are unprecedented. that's the point of it.
>
>
>> I don't really see this as having any detectable advantage over the
>xml
>
>then i suggest you stick with x.m.l., brett. believe me, that's o.k.
>with me!
>
>
>> Blank lines are easy to understand, they are hard to edit,
>> which is the problem.
>
>i don't know about _your_ keyboard, brett, but mine has this
>big key that is clearly labeled "return" in a prominent place...
>and there's another one -- labeled "enter" -- in the corner...
>

The problem is that it is hard to see the mark-up and harder to find
the errors, as they rarely break validation. Four CRs rather than the
CRs would still validate while would be an error that a validator
would catch.

-- 
Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm
Livejournal http://brett-dunbar.livejournal.com/
Brett Paul Dunbar
To email me, use reply-to address

From marcello at perathoner.de  Fri Oct 20 03:54:46 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 03:54:51 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538AAF6.9040809@perathoner.de>

Bowerbird@aol.com wrote:

> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...

\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print '';
> print '';
> print '
> ';
> print "my antonia!";
> print "

";
> print $thebook;

Show this to your 4th grader, so he/she won't stay a 4th grader forever.

Oh! And this actually takes the file name from the command line instead
of hardcoding it into the program. How to extract the title out of the
file is left as an exercise for the 4th grader.


#!/usr/bin/perl

# slurp whole file, mem is cheap
undef $/;
$text = <>;

print <<HERE;

  
    
    my antonia!
  
  
    
$text
HERE

-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 04:22:51 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 04:22:55 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538B18B.3010000@perathoner.de>

Bowerbird@aol.com wrote:

> and if someone would tell me how to do a "split" on
> a sequence of multiple line-endings, that'd be great.

Why don't you treat yourself to "Perl for Trolls" for halloween?

$ man perlre

...
Matching operations can have various modifiers.  Modifiers that relate
to the interpretation of the regular expression inside are listed
below. ...

    s   Treat string as single line.  That is, change "." to match any
        character whatsoever, even a newline, which normally it would
        not match.
...


#!/usr/bin/perl

undef $/;               # slurp whole file, mem is cheap
$text = <>;

$text =~ s/\r\n?/\n/g;  # fix brain-dead M$-DOS and Mac line endings

@chapters = split (/\n{5}/s, $text); # 5 or whatever

for (@chapters) {
    # do something with chapters, like printing
    print "CHAPTER:\n\n$_\n\n";
}

-- 
Marcello Perathoner
webmaster@gutenberg.org

From bill at williamtozier.com  Fri Oct 20 05:06:28 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 05:06:40 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <1161300988.6501.26.camel@localhost.localdomain>
References: <1161300988.6501.26.camel@localhost.localdomain>
Message-ID: <8B510152-ED4D-4EA0-9B77-1CAB179464A9@williamtozier.com>

On Oct 19, 2006, at 7:36 PM, David A. Desrosiers wrote:

> On Thu, 2006-10-19 at 16:39 -0400, Bowerbird@aol.com wrote:
>> i'm confident the hundreds of lurkers here
>> can tell -- without any difficulties at all --
>> who is being rational, solid, and thoughtful,
>> and who is not.
>
> Of that, I have no doubt. ;)

As somebody committed to trying to promote and expand the
collaborative effort underway at DP and PG, I disagree. Traffic on
this list, which ideally could be a useful resource for production and
improvement of the community and workflow, has degenerated to
bickering over trivialities.

The usefulness of fully open "free" social systems is easily
undermined, and this is a prime example. Anybody happening across this
whole fiasco, or as far as I can see *any* contribution by bowerbird,
would be quick to dismiss the entire community's effort as an
amateurish and pointless waste of time as a result.

Will you all please stop it and go away, if only to allow what little
real discussion is required to proceed? Any subscriber who you might
*want* to be your audience has you in a killfile, and everybody else
reading this in archives or as a newcomer is being exposed to your
infantile quibbling as a first taste of what PG workers are like.

Nobody cares now, nor will they in the future, about this fiddling
nonsense. If it's so damned important, go *do* it and stop asking for
pats on the head or acknowledgment of your genius. Go settle this
elsewhere, please.

Is nobody administering this list? To what end, exactly? If it's
merely been set up as flypaper for bowerbird, then please put that in
the headers so we can at least let it simmer here out of public view.

bowerbird and respondents: Just go write in one another's blog
comments, please. Please!

-----
Bill Tozier
AIM: vaguery@mac.com
blog: http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype: vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
--Hjalmar Hjorth Boyesen

From prosfilaes at gmail.com  Fri Oct 20 05:18:18 2006
From: prosfilaes at gmail.com (David Starner)
Date: Fri Oct 20 05:18:22 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: 
References: 
Message-ID: <6d99d1fd0610200518x41627ac0r19a9ca01f54a8ec0@mail.gmail.com>

On 10/19/06, Bowerbird@aol.com wrote:
> they'll be decided by users out in the real world,
> making decisions based on costs and benefits...

Out in the real world, many groups that transcribe texts,
like Oxford, use TEI. How many use ZML?

From cannona at fireantproductions.com  Fri Oct 20 07:57:38 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Fri Oct 20 07:58:14 2006
Subject: [gutvol-d] what it all boils down to
References: 
Message-ID: <003b01c6f458$2c62f520$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I wrote:
> Get it?

Alas, it appears he did not.

Aaron Cannon

- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message ----- 
From: 
To: ; 
Sent: Friday, October 20, 2006 2:54 AM
Subject: re: [gutvol-d] what it all boils down to

> when i "call" you,
> you're supposed to
> show me your cards,
> not tell me a story
> about a shark...
>
> i think you are now
> _firmly_ on the record
> that z.m.l. won't work...
>
> thanks for putting yourself
> _firmly_ on the record, aaron.
> time will tell... yes, time will tell...
>
> as it is, you exhausted my patience
> -- which is a difficult thing to do! --
> so i won't likely be responding to you
> again any time soon, just so you know.
>
> -bowerbird
>
- --------------------------------------------------------------------------------
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOOQII7J99hVZuJcRArKjAKC1Q7ssBPqDJEIPkYlplXVjwjMqkgCg1yYG
kzF/f5pAb0YUrjxahvjGGEM=
=QbBH
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Fri Oct 20 09:00:00 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:00:11 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: 

bill, the only time there is "needless bickering" on this list
is when my detractors instigate it. as you might or might not
be aware, they'd been quiet for quite a while, and it was peaceful
here in the land of gutvol-d. it was only this week they came out
and raised a big fuss again...

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/d34051a2/attachment.html
From Bowerbird at aol.com  Fri Oct 20 09:14:13 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:14:28 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: 

david said:
> Out in the real world, many groups that transcribe texts,
> like Oxford, use TEI. How many use ZML?

out in the real real world, most people use .pdf as their format.
sad but true.

it's true that some entities -- i would use university of virginia
as my example, but oxford is fine too -- use .tei, that is true...

(of course, you might want to notice that none of those entities
has one-tenth the traction that project gutenberg has. why not?)

anyway, some people use docbook, some roll their own format.
there's certainly no abundance of experts in any of these
esoteric formats, certainly none that have shown up _here_
to volunteer their expertise and ease the markup burden,
which is why the .tei effort here is moving along so glacially.

meanwhile, a new move to no-markup authoring is proceeding
at an ever-increasing speed out in the "real real world" where
"user-generated content" is among the buzzwords of the day.
because there'd be no faster way to kill the buzz than to require
plain ordinary people to deal with heavy markup to contribute...

markdown has a big following, with a crew of developers in tow,
and served as the model for the "crossmark" function that o.l.p.c.
is developing.

wiki-formatting is huge, even if we look no farther
than wikipedia itself.

and with the advent of wysiwyg authoring right in the webpage,
with innovations like "writely" and the rest, the days of heavy
markup that writers need to deal with is going to come to an end
-- a very sudden end -- and that will be soon.

so if you're counting on "installed base" as your best argument
for heavy markup, you're putting yourself on very shaky ground.

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/c8846902/attachment.html
From Bowerbird at aol.com  Fri Oct 20 09:19:11 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:19:23 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
Message-ID: <380.f7cf57f.326a50ff@aol.com>

marcello said:
> Oh! And this actually takes the file name from the command line instead
> of hardcoding it into the program. How to extract the title out of the
> file is left as an exercise for the 4th grader.
>
> #!/usr/bin/perl
> # slurp whole file, mem is cheap
> undef $/;
> $text = <>;
> print <
>
>
>     my antonia!
>
>
>     $text

believe it or not, we're getting something constructive out of marcello!
that's amazing!
open-source really _is_ transformative, isn't it! :+)

ok, sometimes -- when you make a script available for people to use --
the ability to get the filename from the command-line is good practice.
so thank you, marcello, for this excellent example of how to do that...

other times, when the script is sitting on your website and to be called,
you'll want it to read the parameters as passed from the calling script.
so, for instance, if we were to pass the filename starting in column 12:
>   read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
>   $thefilename=substr($buffer,12);

we're getting a little ahead of ourselves right now, it's best to take this
one step at a time, but since marcello did good, we wanna reward him.

a philosophical mantra of perl is "there is more than one way to do it",
so don't hesitate to provide your input on any of these code examples.
that's what the whole "many eyeballs" thing is all about, folks!

-bowerbird

p.s. marcello also weighed in with a good reply on the "split" i asked for,
and we'll get to that in coming days, i promise. the message there was
"sometimes the bug ain't where you think it is, so keep your mind open."
but more on that later...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/f62b63f2/attachment.html
From Bowerbird at aol.com  Fri Oct 20 09:19:23 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:19:40 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: 

brett, you've exhausted my patience as well. adieu.

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/af5fe5d1/attachment.html
From sam.bretheim at gmail.com  Fri Oct 20 09:24:56 2006
From: sam.bretheim at gmail.com (Sam Bretheim)
Date: Fri Oct 20 09:33:13 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
Message-ID: <4538F858.8020809@gmail.com>

I propose that we create a new mailing list, perhaps called something
like gutvol-alternatives-d, gutvol-debate-d, or gutvol-formats-d, in
order to quarantine debate on "revolutionary" digitization approaches
other than the standard PG and PGDP production processes. This proposal
is not an attempt to crush dissent and innovation; rather, it is
intended to decrease the signal-to-noise ratio on this forum, for the
benefit of the many volunteers who have no interest in the matter.

From desrod at gnu-designs.com  Fri Oct 20 09:42:21 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Fri Oct 20 09:43:34 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <4538F858.8020809@gmail.com>
References: <4538F858.8020809@gmail.com>
Message-ID: <1161362541.6048.0.camel@localhost.localdomain>

On Fri, 2006-10-20 at 10:24 -0600, Sam Bretheim wrote:
> This proposal is not an attempt to crush dissent and innovation;
> rather, it is intended to decrease the signal-to-noise ratio on this
> forum, for the benefit of the many volunteers who have no interest in
> the matter.

I concur, great idea.

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/e9b267d3/attachment-0001.bin
From desrod at gnu-designs.com  Fri Oct 20 09:47:46 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Fri Oct 20 09:48:33 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: 
References: 
Message-ID: <1161362866.6048.3.camel@localhost.localdomain>

On Fri, 2006-10-20 at 12:14 -0400, Bowerbird@aol.com wrote:
> > Out in the real world, many groups that transcribe texts,
> > like Oxford, use TEI. How many use ZML?

> out in the real real world, most people use .pdf as their format.
> sad but true.

People don't author PDF files, they SaveAs/Export/Print to PDF format
files, but their source material is in some other format... HTML, XML,
Microsoft Word, OpenOffice.org and so on. I think you're confusing the
two...

There is no reason that I can see, why ZML, XML, TEI, TeX, foo, bar and
blort formats can't all support the same final output, since the end
users will never have to interact with the original source material that
was used to produce them.

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/0096532a/attachment.bin
From Bowerbird at aol.com  Fri Oct 20 09:50:16 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:50:35 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
Message-ID: 

keith said:
> Do to its nature multi-line parsing or splitting is not quite that easy

maybe. but we'll make it work. :+)

> split is nice. But you want to be doing parsing

on a personal note, any time i call what i'm doing "parsing",
it goes badly. but as soon as i call it by _another_ name,
the same thing with the same code, it starts working better.
so i've grown allergic to that word, and i almost never use it.
:+)

however, i assume that you're talking about "parsing" in the
"let's parse the dom tree" sense. (no, i don't even know what
that means, so i might well have misused it, which would be
poetic in its own way.)

that kind of "parsing" would make our code very complicated.
and in the same way that i don't like my format to be complex,
i don't like my programs to be complex. so i make them simple.

and what i'm showing people here is how much mileage can be
obtained out of the simple combination of a simple format and
some simple programs. that's the whole purpose of this exercise.

so just stick with me for a little bit on "split", and see some tricks.

(and, just to be clear, although you might think this is related to
z.m.l., and thus can be swiftly relegated to the "i don't care" pile,
the truth of the matter is that since virtually all of the books in the
p.g. library have a plain-ascii representative, one that is close to
z.m.l. format and perhaps even exact, the code that i'm showing
here could also be used to great effect on the library as it stands.
there are a lot of neat features that could be offered with very little
work or trouble by using the simple code routines i'll reveal here.
just as an example, how about a simple script that would give us
a list of the section-headers for every book in the entire library?
i don't know about you, but i think this "super table of contents"
for the entire library would be very cool, and likely quite useful.
and within a week or two of these little daily lessons, we'll have it.)

> Just for the fun of it your script is incomplete.

good observation, keith. now tell us why, and complete it... ;+)

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/f8f526fe/attachment.html
From joshua at hutchinson.net  Fri Oct 20 09:51:43 2006
From: joshua at hutchinson.net (joshua@hutchinson.net)
Date: Fri Oct 20 09:52:02 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
Message-ID: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net>

That'd be great except for one thing. We tried creating a separate
forum for this stuff before and the birdie boy ignored it and kept on
posting his flamebait to gutvol-d. The only thing that ever works
(outside outright banning him, which we've also done in the past) is
for everyone to ignore him until he gets bored (but he always comes
back again later).

The problem is this: bowerbird is probably one of the most proficient
and accomplished trolls I've run across in 15+ years on the Internet.
He's definitely the only one I've ever promised to punch in the mouth
if we ever meet in person! He will eventually rant idiocy about
something someone truly cares about and then the flames will start all
over again.

Josh

>----Original Message----
>From: desrod@gnu-designs.com
>Date: Oct 20, 2006 12:42
>To: "Project Gutenberg Volunteer Discussion"
>Subj: Re: [gutvol-d] Proposal: creation of new mailing list for PG-
related format and process debate
>
>On Fri, 2006-10-20 at 10:24 -0600, Sam Bretheim wrote:
>> This proposal is not an attempt to crush dissent and innovation;
>> rather, it is intended to decrease the signal-to-noise ratio on this
>> forum, for the benefit of the many volunteers who have no interest in
>> the matter.
>
>I concur, great idea.
>
>--
>David A. Desrosiers
>desrod@gnu-designs.com
>http://gnu-designs.com
>
>"Erosion of civil liberties... is a threat to national security."
>_______________________________________________
>gutvol-d mailing list
>gutvol-d@lists.pglaf.org
>http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

From bill at williamtozier.com  Fri Oct 20 09:53:53 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 09:54:05 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: 
References: 
Message-ID: <596C2A52-74DC-4F7C-B898-816FF9D27AA5@williamtozier.com>

On Oct 20, 2006, at 12:00 PM, Bowerbird@aol.com wrote:

> bill, the only time there is "needless bickering" on this list
> is when my detractors instigate it. as you might or might not
> be aware, they'd been quiet for quite a while, and it was peaceful
> here in the land of gutvol-d. it was only this week they came out
> and raised a big fuss again...

Then it is the better part of valor to ignore them. Clearly, if the
only traffic is people picking on you, and the best course to
undermine trolls' behavior is to ignore them, then you are best served
by ignoring your detractors.

Giving in to the impulse to correct their misstatements immediately is
exactly what they're looking for. Any reasonable person will be able
to read the archives and see that they are simply throwing sticks and
stones, and that (if you cease posting in response) you're the only
one providing actual useful content.

Can't lose by staying quiet. Give it a shot.

-----
Bill Tozier
AIM: vaguery@mac.com
blog: http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype: vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
--Hjalmar Hjorth Boyesen

From bill at williamtozier.com  Fri Oct 20 09:54:21 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 09:54:26 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <4538F858.8020809@gmail.com>
References: <4538F858.8020809@gmail.com>
Message-ID: <70F41ED1-6EC2-4A83-AA1A-F5D416BC4C7A@williamtozier.com>

On Oct 20, 2006, at 12:24 PM, Sam Bretheim wrote:

> I propose that we create a new mailing list, perhaps called
> something like gutvol-alternatives-d, gutvol-debate-d, or gutvol-
> formats-d, in order to quarantine debate on "revolutionary"
> digitization approaches other than the standard PG and PGDP
> production processes. This proposal is not an attempt to crush
> dissent and innovation; rather, it is intended to decrease the
> signal-to-noise ratio on this forum, for the benefit of the many
> volunteers who have no interest in the matter.

God yes.

-----
Bill Tozier
AIM: vaguery@mac.com
blog: http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype: vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
--Hjalmar Hjorth Boyesen

From Bowerbird at aol.com  Fri Oct 20 10:14:31 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 10:14:40 2006
Subject: [gutvol-d] re: to punch in the mouth
Message-ID: 

josh said:
> He's definitely the only one I've ever promised
> to punch in the mouth if we ever meet in person!

are you serious?

i mean, really?

because that's _funny_.

anyway, i can't promise that i won't punch you right back,
but violence is _so_ 20th-century...

much better to just have you thrown in jail, i guess.

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/e84f39fc/attachment.html
From marcello at perathoner.de  Fri Oct 20 10:24:25 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:24:29 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: 
References: 
Message-ID: <45390649.7080609@perathoner.de>

Bowerbird@aol.com wrote:

> wiki-formatting is huge, even if we look no farther
> than wikipedia itself.

Then why do you invent a new format that is inferior to wiki?

Everybody already knows wiki, so go with that.

-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 10:26:15 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:26:18 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <380.f7cf57f.326a50ff@aol.com>
References: <380.f7cf57f.326a50ff@aol.com>
Message-ID: <453906B7.6070700@perathoner.de>

Bowerbird@aol.com wrote:

> believe it or not, we're getting something constructive out of marcello!

Now we just need to figure out how to get the same out of you!

-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 10:34:03 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:34:10 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <380.f7cf57f.326a50ff@aol.com>
References: <380.f7cf57f.326a50ff@aol.com>
Message-ID: <4539088B.2070009@perathoner.de>

Bowerbird@aol.com wrote:

> other times, when the script is sitting on your website and to be called,
> you'll want it to read the parameters as passed from the calling script.
> so, for instance, if we were to pass the filename starting in column 12:
>>   read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
>>   $thefilename=substr($buffer,12);

You are just cutting and pasting out of some perl cgi tutorial. You
don't have the least idea what is going on.
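
To make the objection concrete: a POSTed form body is normally a string of name=value pairs, so substr($buffer,12) only works if the field name plus "=" happens to be exactly 12 characters long (a field called thefilename would fit, which is presumably where the number comes from). Below is a minimal sketch that parses the pairs by name instead and refuses anything that is not a plain file name. The sub names, the field name, and the sample body are assumptions for illustration; only core Perl is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a hex byte.
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Parse an application/x-www-form-urlencoded body into a hash.
sub parse_form {
    my ($body) = @_;
    my %param;
    for my $pair (split /[&;]/, $body) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $param{ url_decode($name) } = url_decode(defined $value ? $value : '');
    }
    return %param;
}

# Accept only a plain file name: no slashes, no leading dot.
# Returns the name if it is acceptable, undef otherwise.
sub safe_filename {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /\A[A-Za-z0-9_][A-Za-z0-9_.-]*\z/ ? $name : undef;
}

# In a real CGI script the body would come from
#   read(STDIN, my $buffer, $ENV{CONTENT_LENGTH});
# a literal string is used here so the sketch runs on its own.
my %form = parse_form('thefilename=myant.zml&page=12');
my $file = safe_filename($form{thefilename});
print defined $file ? "$file\n" : "rejected\n";
```

Checking the name before it gets anywhere near open matters on a public web server, since the value comes straight from whoever submits the form.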
-- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Fri Oct 20 10:35:05 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 10:35:25 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: william said: > Can't lose by staying quiet. Give it a shot. if by "staying quiet", you mean "stop responding to trolls", well, i've done just that, today, with both aaron and brett... and josh -- with a _repeat_ of a promise to "punch" me -- has probably earned himself that distinction now as well... and after i complimented a post of his, and replied to it _in_detail_ (which got absolutely no response from him), sam is now suggesting that i be exiled off to another list. um, i guess the message is that if you can't answer a critic, you should have him silenced. that seems to be in vogue with our president these days, but here on gutvol-d too? so yeah, in some ways it looks pretty bleak here. on the other hand, marcello, who has had a signal-to-noise ratio of about 3/997 in the past, came through with _two_ (2) constructive posts today. (oh sure, they had trollish language, but nonetheless they were _constructive_ in the sense that they were on-point and added a relevant point. what more can i ask?) and in spite of his occasional resort into ad hominem land, david consistently drags in some good arguments as well... and keith is kicking in some good stuff too. so i don't think gutvol-d is the wasteland you've described. would it be better if people were civil to me? undoubtedly. because then i could continue to be my normal civil self... and would it be better if people who can't be civil to me would simply not respond to me? absolutely. i loved the peace and quiet that has been the normal mode around here for months. in fact, i _encourage_ those people who cannot resist responding to me to put me in their kill-files and not even _read_ my posts; i'm not here for the conflict. 
frankly, i think conflict is _stupid_... *** on the other hand, if you mean that i should stop speaking, simply because some people occasionally jump all over me, let me just say that that's not likely to happen, bill. not at all. of course, if the president happens to pull me off the street as "a suspected terrorist", then you'll stop hearing from me. (i can see josh reaching for the phone now to call the c.i.a.) but barring that, i'll post here regularly, with relevant thoughts. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/c062df45/attachment-0001.html From marcello at perathoner.de Fri Oct 20 10:36:08 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri Oct 20 10:36:12 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: <4538F858.8020809@gmail.com> References: <4538F858.8020809@gmail.com> Message-ID: <45390908.4020705@perathoner.de> Sam Bretheim wrote: > I propose that we create a new mailing list, perhaps called something > like gutvol-alternatives-d, gutvol-debate-d, or gutvol-formats-d, in > order to quarantine debate on "revolutionary" digitization approaches > other than the standard PG and PGDP production processes. This proposal > is not an attempt to crush dissent and innovation; rather, it is > intended to decrease the signal-to-noise ratio on this forum, for the > benefit of the many volunteers who have no interest in the matter. We already have that: gutvol-p Bowerbird got moderated on gutvol-p so he took over this one. 
-- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Fri Oct 20 11:00:32 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 11:00:42 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: david said: > People don't author PDF files, they SaveAs/Export/Print to PDF format files, > but their source material is in some other format... HTML, XML, > Microsoft Word, OpenOffice.org and so on. the point is, the thing they distribute to their readers is a .pdf. (and hey, david, i don't like that fact any better than you do.) > I think you're confusing the two... no, i'm clear. when most entities out there in the real world make a file available to other entities, they do it using a .pdf. your distinction between "authoring" and "save/export/print to pdf" is an arbitrary one. we could make the same point about .html, with some people composing it in dreamweaver, others in notepad, others in microsoft word or openoffice, and so on. but the point is that they're using .html as the vehicle. (and in most cases, they will put .html material on the web itself, rather than distribute it as files. given _that_ view of things, then .html is a bigger vehicle than .pdf.) > There is no reason that I can see, > why ZML, XML, TEI, TeX, foo, bar and blort formats > can't all support the same final output wow. did you really mean to include .zml in that list? i mean, i agree with the statement as you wrote it, but it would be a major concession for you to say that .zml can do everything .xml can do... or do i misunderstand? > since the end users will never have to interact with > the original source material that was used to produce them. that is one view of things, that the "master version" is one that is never shared with users, that all they receive is derivative versions. personally, i'd rather _empower_ users by giving them the "master". 
and i'd like to further empower users by giving them conversion tools, so they could generate all the derivative versions themselves, without any need for ever having to consult with me at any time down the line. (people seem to _expect_ that the web "will always be there", but i know that an evil president could shut the thing down in the blink of an eye, and i am _not_ so naive as to believe we'll never have an evil president. due to this, i want to distribute the books in our library far and wide.) even _more_, i'd like to empower users by giving them authoring and viewing tools that work directly on the "master version", so they would have no need to have to ever bother with generating derivative formats. (part and parcel of this is giving them a format that's easy to understand.) this independence is (for me) the only long-range plan worth supporting. i'm guessing that if i _can_ empower users in these ways, that they will come to me, instead of going to someone that wants to be their master by hoarding the "master version". you might see it differently. viva la... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/bc3beb4a/attachment.html From lee at novomail.net Fri Oct 20 11:04:19 2006 From: lee at novomail.net (Lee Passey) Date: Fri Oct 20 11:02:49 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: References: Message-ID: <45390FA3.4050502@novomail.net> Schultz Keith J. wrote: > Hi There, > > There are already such tools availible commercially!! > > It has been around for a long time TeX and LaTeX. Textures is a > WYSIWYG system and authouring Tool. > > LaTeX can be easily converted to pdf, html, xml, docbook, etc. > > As Bowerbird mentioned in another thread why reinvent the wheel or > try to. > > Just my two Euro cents worth! > > regards > Keith. > > P.S. 
LaTeX is mark-up, Has chapters, paragraphs, formulas, graphics, > footnotes, layout control, indices, bibliographie, multi-language, > right-left, left-right., ASCII, UniCode, pratically platform > independent. There are freeware versions, but they are generally not > WYSIWYG, thereby having at first a stiff learning curve. > > Keith. I certainly agree that we should adopt as much from existing formats/software as possible. If you look at Mr. Noring's original call to action that has Bowerbird so agitated (http://groups.yahoo.com/group/ebook-community/message/26923) you will see that it is, at this point, more an attempt to gather requirements than to specify a design. Now LaTeX is advertised as a "typesetting language," so I had originally dismissed it as a presentational language, whereas what we are looking for is a way to mark up a document structure without specifying the presentation. Prompted by your message, I went to the Internet and looked at LaTeX a little more closely. I discovered that I was wrong. LaTeX is document structure markup, not document presentation markup, apparently almost as powerful as TEI, and probably more powerful than XHTML. I think it could work as a master format for an authoring tool. However, in the spirit of not re-inventing wheels, I would like to reuse not only an existing format, but also existing code and tools as well. TeX isn't useful in this context because it is a printer driver; we're not interested in printing, we're interested in conversion from the master format to multiple e-book formats. Are there other tools or available code which we could re-use? My current bias is to start with a subset of TEI because 1. it is well-understood and well-established, and 2. there are lots of tools and available code implementations to manipulate XML files. Nonetheless, I could be persuaded to go with LaTeX, and would love to here the arguments in its favor.
I do note that this discussion is somewhat tangential to the mission of Project Gutenberg, which is to make available simplified text versions of public domain works, so I would invite everyone interested to continue the conversation on the ebook-community discussion list. -- Nothing of significance below this line. From Bowerbird at aol.com Fri Oct 20 11:09:00 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 11:09:08 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day Message-ID: marcello is trying to overwhelm your e-mailbox with posts, so everyone gives up and stops reading all of these threads. it's a crude technique, but it does work. thus, to counter it, i've responded to several of his messages in this one reply. *** marcello said: > Then why do you invent a new format that is inferior to wiki? because i like to tinker. besides, z.m.l. is _not_ "inferior" to wiki. it might not be "superior" either, but it's definitely not "inferior". it's just a different take on the same general idea. since it is a rather new idea, it's good to experiment with many approaches. > Everybody already knows wiki, so go with that. i'm not the type of person who "goes" with "everybody". but thanks for the suggestion. > Now we just need to figure out how to get the same out of you! except you are playing in the sandbox of my thread. let's see _you_ start a thread that people care about. 95% of your posts over the last 3 years have been a _direct_ reply to a point of _mine_. it's as if you have no life at all except the one that i give to you. i'm the rain, and you're the poisonous mushrooms. c'mon, man, develop a _spine_, for crying out loud... > You are just cutting and pasting out of some perl cgi tutorial. is that a bad thing? that's how most programmers start learning. and yep, i'm just a beginner with perl. 
yet i'm going to show you how much power can be realized by even a beginner like me -- given a nice, simple format like .zml -- with nothing but a rudimentary knowledge of a dozen or so concepts and commands... that's the whole point of this little exercise... > You don't have the least idea what is going on. i don't think it's too hard to figure those lines out. would you like me to explain them to you and then you can tell me if i got it right?, because i can do that. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/217103eb/attachment.html From desrod at gnu-designs.com Fri Oct 20 11:12:32 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Fri Oct 20 11:13:37 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <1161367952.6048.5.camel@localhost.localdomain> On Fri, 2006-10-20 at 14:09 -0400, Bowerbird@aol.com wrote: > thus, to counter it, i've responded to several of his messages in this > one reply. But you've retained the Reply-To and MessageID headers from only one thread, so this reply will get buried deep inside only one thread, so the context in the other threads will be lost. That is the whole point of threading... and now you've co-opted it for reasons which I cannot seem to understand. -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/02b7f7d0/attachment.bin From lee at novomail.net Fri Oct 20 11:19:28 2006 From: lee at novomail.net (Lee Passey) Date: Fri Oct 20 11:17:57 2006 Subject: [gutvol-d] what it all boils down to In-Reply-To: References: Message-ID: <45391330.4030702@novomail.net> (I'm probably going to regret this, but ...) Bowerbird@aol.com wrote: > or "indent lines of the poem however much you want them to > be indented, but use at least one space of indentation so we > know that it's a _block_ and that it shouldn't be re-wrapped"... Just out of idle curiosity, in ZML how do you mark up a block quotation, that is, one or more full or partial paragraphs quoted from another source, which are typically block offset but which should nevertheless be word-wrapped? -- Nothing of significance below this line. From marcello at perathoner.de Fri Oct 20 11:18:42 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri Oct 20 11:18:46 2006 Subject: [gutvol-d] pulled back from the brink to live yet another day In-Reply-To: References: Message-ID: <45391302.8040909@perathoner.de> Bowerbird@aol.com wrote: >> You don't have the least idea what is going on. > > i don't think it's too hard to figure those lines out. > would you like me to explain them to you and then > you can tell me if i got it right?, because i can do that. Yes. Please explain. Make my day. -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Fri Oct 20 11:19:36 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 11:19:48 2006 Subject: [gutvol-d] meanwhile, i'm really excited! Message-ID: lee said: > Prompted by your message, I went to the Internet and > looked at LaTeX a little more closely. I discovered that I was wrong. see, william, here's another good thing. 
lee has discovered latex, and it happened because he was involved in a thread right here... > Nonetheless, I could be persuaded to go with LaTeX, > and would love to here the arguments in its favor. hear here! > I do note that this discussion is somewhat tangential to > the mission of Project Gutenberg, which is to make > available simplified text versions of public domain works founder michael hart doesn't define the mission that narrowly. he's in favor of whatever formats -- in addition to plain text -- people want, so a multi-format converter is right on-topic here. indeed, that is the impetus for the whole movement to .tei, or at least it was originally, that it could generate multiple formats. > so I would invite everyone interested to continue the > conversation on the ebook-community discussion list. for all the people who want to get away from bowerbird, jon noring's listserve is _the_ place to do that, yes sir! :+) yep, jon banned me from there many many years ago... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/fd1764d4/attachment-0001.html From jon at noring.name Fri Oct 20 11:24:26 2006 From: jon at noring.name (Jon Noring) Date: Fri Oct 20 11:31:27 2006 Subject: [gutvol-d] meanwhile, i'm really excited! In-Reply-To: <45390FA3.4050502@novomail.net> References: <45390FA3.4050502@novomail.net> Message-ID: <891243459.20061020122426@noring.name> Lee wrote: > I certainly agree that we should adopt as much from existing > formats/software as possible. If you look at Mr. Noring's original call > to action that has Bowerbird so agitated > (http://groups.yahoo.com/group/ebook-community/message/26923) you will > see that it is, at this point, more an attempt to gather requirements > than to specify a design. Yes, my original TeBC message was essentially a requirements gathering process. 
Also note that I did not post the call for requirements to gutvol-* since I deemed it to be mostly off-topic to gutvol. It was Bowerbird who dragged that into gutvol-* and from there the fun began. Jon Noring From Bowerbird at aol.com Fri Oct 20 11:40:16 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 11:40:23 2006 Subject: [gutvol-d] what it all boils down to Message-ID: lee said: > Just out of idle curiosity, in ZML how do you mark up a block quotation, > that is, one or more full or partial paragraphs quoted from another source, > which are typically block offset but which should nevertheless be word-wrapped? if you have questions about the rules of .zml, you should go review them: > http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt (especially on days like the last week, when message traffic is so heavy here.) if you still have questions after reading the rules, then i'll be happy to help. but i suspect that this isn't merely "idle curiosity". :+) my guess is that you're trying to ask when word-wrapping _will_ occur, and when it will not. or, put another way, how can an author specify that word-wrapping _should_ be done, as opposed to when it should _not_... if that's what you _really_ want to know, lee, then ask it directly, ok?, and i'll be happy to tell you, assuming the answer is not clear to you once you've read the rules. (i honestly can't remember if it's clear in the version of the rules that is posted, which might be outdated now, so maybe you can tell me that. i might well have considered the case where rewrapping is _wanted_ to be "too advanced" for the rules then; and part of the reason for +that_ is that "rewrapping" might not mean the same thing in your mind as it means in the display-world of .zml... but that's getting _far_ too removed from any relevance to gutvol-d.) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/221ad601/attachment.html From marcello at perathoner.de Fri Oct 20 11:49:22 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri Oct 20 11:49:25 2006 Subject: [gutvol-d] what it all boils down to In-Reply-To: References: Message-ID: <45391A32.1050404@perathoner.de> Bowerbird@aol.com wrote: > if that's what you _really_ want to know, lee, then ask it directly, ok?, > and i'll be happy to tell you, assuming the answer is not clear to you > once you've read the rules. (i honestly can't remember if it's clear in > the version of the rules that is posted, which might be outdated now, > so maybe you can tell me that. i might well have considered the case > where rewrapping is _wanted_ to be "too advanced" for the rules then; > and part of the reason for +that_ is that "rewrapping" might not mean > the same thing in your mind as it means in the display-world of .zml... > but that's getting _far_ too removed from any relevance to gutvol-d.) That's a long-winded way to say: blockquotes are not supported in ZML. -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Fri Oct 20 12:13:43 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 20 12:13:50 2006 Subject: [gutvol-d] a new policy -- one post per day! Message-ID: <305.472c68d1.326a79e7@aol.com> beginning on monday of next week, i will adopt a new policy, and post just one message per day. on some days, it'll be a very _long_ post, to be sure, if lotsa people have pitched flak at me the day before. nonetheless, i will still make just _one_ post per day, so as to sidestep the overflow-mailbox strategy that some of my detractors are forcing down our throats. so let's get started, eh? this is my last post for today. -bowerbird p.s. and yes, we'll still do the open-source project! i'll just work it into my one-post-per-day regime... have a nice weekend!
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061020/8206aa24/attachment.html From lee at novomail.net Fri Oct 20 12:37:05 2006 From: lee at novomail.net (Lee Passey) Date: Fri Oct 20 12:35:37 2006 Subject: [gutvol-d] what it all boils down to In-Reply-To: References: Message-ID: <45392561.1010709@novomail.net> Bowerbird@aol.com wrote: > lee said: >> Just out of idle curiosity, in ZML how do you mark up a block >> quotation, that is, one or more full or partial paragraphs quoted >> from another source, which are typically block offset but which >> should nevertheless be word-wrapped? > > if you have questions about the rules of .zml, you should go review > them: > http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt > (especially on days like the last week, when message traffic is so > heavy here.) > > if you still have questions after reading the rules, then i'll be > happy to help. > > but i suspect that this isn't merely "idle curiosity". :+) > > my guess is that you're trying to ask when word-wrapping _will_ > occur, and when it will not. or, put another way, how can an author > specify that word-wrapping _should_ be done, as opposed to when it > should _not_.. Actually, what I'm really trying to ask is, in ZML how do you mark up a block quotation? The only mention of blocks in your ZML description file is in ruleset 4, where you suggest that no "line" which begins with a whitespace will be wrapped (I simply assume that, absent indications to the contrary, all other text will be wrapped). Maybe you could make a /new/ rule that block quotes are a collection of lines that begin with right angle bracket, as above? Then your ZML viewer program could detect that, remove the markup, and display the block quote according to the user's preferences. -- Nothing of significance below this line. 
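The rule suggested in the message above (treat a run of lines beginning with a right angle bracket as one block quote, strip the marker, and let the viewer re-wrap it to the reader's preferences) can be sketched roughly as follows. This is only an illustration under stated assumptions: the ">" marker is the proposal from the message, not part of the posted ZML rules, and the names split_blocks, render, QUOTE_MARK, width and indent are hypothetical and do not come from any ZML viewer.

```python
import textwrap

# Hypothetical marker: the message above proposes ">" at the start of a line.
QUOTE_MARK = ">"


def split_blocks(text):
    """Group consecutive lines into ("quote", [...]) and ("text", [...]) runs.

    Quote lines have the marker and surrounding whitespace removed.
    """
    blocks = []
    for line in text.splitlines():
        is_quote = line.startswith(QUOTE_MARK)
        kind = "quote" if is_quote else "text"
        content = line[len(QUOTE_MARK):].strip() if is_quote else line
        if blocks and blocks[-1][0] == kind:
            blocks[-1][1].append(content)
        else:
            blocks.append((kind, [content]))
    return blocks


def render(text, width=60, indent="    "):
    """Re-wrap each quote run as one offset paragraph; pass other lines through.

    Simplification: a quote run is treated as a single paragraph, so blank
    quote lines do not start a new paragraph in this sketch.
    """
    out = []
    for kind, lines in split_blocks(text):
        if kind == "quote":
            paragraph = " ".join(part for part in lines if part)
            out.append(textwrap.fill(paragraph, width=width,
                                     initial_indent=indent,
                                     subsequent_indent=indent))
        else:
            out.extend(lines)
    return "\n".join(out)
```

Calling render(source_text, width=40, indent="  ") would offset and re-wrap every marked run to 40 columns while leaving all other lines untouched; the width and indent stand in for the reader's display preferences.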
From cannona at fireantproductions.com Fri Oct 20 14:52:30 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Oct 20 14:54:46 2006 Subject: [gutvol-d] meanwhile, i'm really excited! References: <45390FA3.4050502@novomail.net> Message-ID: <010601c6f492$5c7b39e0$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 One great thing about latex is that it is very very widely used. I dare say that the only mark up language that is more widely known is html and possibly WIKI. At least, that's my guess based on my university experience. It seems that virtually all of the graduate level math, science, engineering, statistics, and actuarial students know and use it to one degree or another. Also, it has wonderfully comprehensive support for math and science equations. On the other hand, it might be easier to parse xml. Also, as mediawiki has shown, it is quite easy to take the best latex has to offer (I.E. it's ability to represent complex equations) and add that to nearly any other format. Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Lee Passey" To: "Project Gutenberg Volunteer Discussion" Sent: Friday, October 20, 2006 1:04 PM Subject: Re: [gutvol-d] meanwhile, i'm really excited! > Schultz Keith J. wrote: > >> Hi There, >> >> There are already such tools availible commercially!! >> >> It has been around for a long time TeX and LaTeX. Textures is a >> WYSIWYG system and authouring Tool. >> >> LaTeX can be easily converted to pdf, html, xml, docbook, etc. >> >> As Bowerbird mentioned in another thread why reinvent the wheel or >> try to. >> >> Just my two Euro cents worth! >> >> regards > > Keith. >> >> P.S. LaTeX is mark-up, Has chapters, paragraphs, formulas, graphics, >> footnotes, layout control, indices, bibliographie, multi-language, >> right-left, left-right., ASCII, UniCode, pratically platform >> independent. 
There are freeware versions, but they are generally not >> WYSIWYG, thereby having at first a stiff learning curve. >> >> Keith. > > I certainly agree that we should adopt as much from existing > formats/software as possible. If you look at Mr. Noring's original call to > action that has Bowerbird so agitated > (http://groups.yahoo.com/group/ebook-community/message/26923) you will see > that it is, at this point, more an attempt to gather requirements than to > specify a design. > > Now LaTeX is advertised as a "typesetting language," so I had originally > dismissed it as a presentational language, whereas what we are looking for > is a way to markup a document structure without specifying the > presentation. Prompted by your message, I went to the Internet and looked > at LaTeX a little more closely. I discovered that I was wrong. > > LaTeX is document structure markup, not document presentation markup, > apparently almost as powerful as TEI, and probably more powerful than > XHTML. I think it could work as a master format for an authoring tool. > > However, in the spirit of not re-inventing wheels, I would like to reuse > not only an existing format, but also existing code and tools as well. TeK > isn't usefully in this context because it is a printer driver; we're not > interested in printing, we're interested in conversion from the master > format to multiple e-book formats. Are there other tools or available code > which we could re-use? My current bias is to start with a subset of TEI > because 1. it is well-understood and well-established, and 2. there are > lots of tools and available code implementations to manipulate XML files. > Nonetheless, I could be persuaded to go with LaTeX, and would love to here > the arguments in its favor. 
> > I do note that this discussion is somewhat tangential to the mission of > Project Gutenberg, which is to make available simplified text versions of > public domain works, so I would invite everyone interested to continue the > conversation on the ebook-community discussion list. > > -- > Nothing of significance below this line. > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFOUWoI7J99hVZuJcRAv1nAKDQPXQs3qSMNUtN+xUEbOXJyuDcpACePVcY H2LydfP+xbkj5/5oDMREBTA= =xvGc -----END PGP SIGNATURE----- From scott_bulkmail at productarchitect.com Fri Oct 20 21:11:29 2006 From: scott_bulkmail at productarchitect.com (Scott Lawton) Date: Sat Oct 21 00:44:33 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net> References: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net> Message-ID: >The problem is this: bowerbird is probably one of the most proficient >and accomplished trolls I've run across in 15+ years on the Internet. And, that Greg is too nice to ban him. Or create gutvol-bb and ban him from every other gut list. And, as you noted, that lots of people who should (IMHO) know better continue to reply to him. I periodically check the [folder name censored] where my filter dumps his posts; I've yet to encounter a reason to disable the filter. -- Cheers, Scott S. 
Lawton http://Classicosm.com/ - classic books From cannona at fireantproductions.com Sat Oct 21 04:25:16 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sat Oct 21 04:25:32 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate References: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net> Message-ID: <001d01c6f503$9a85e110$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 If I am not mistaken, Greg unbanned him because of a mandate, not by choice. Or at least, that is what I recall he said on the matter when asked, and I have no reason to doubt him. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Scott Lawton" To: Sent: Friday, October 20, 2006 11:11 PM Subject: Re: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate > >The problem is this: bowerbird is probably one of the most proficient >>and accomplished trolls I've run across in 15+ years on the Internet. > > And, that Greg is too nice to ban him. Or create gutvol-bb and ban him > from every other gut list. > > And, as you noted, that lots of people who should (IMHO) know better > continue to reply to him. I periodically check the [folder name censored] > where my filter dumps his posts; I've yet to encounter a reason to disable > the filter. > -- > > Cheers, > > Scott S. Lawton > http://Classicosm.com/ - classic books > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. 
iD8DBQFFOgOlI7J99hVZuJcRAvdXAJ9MPfw4/bs1samDYFsa7yAoCwl6agCgvB6V A23mtZrAONIR14MHRlXBeBA= =dbfD -----END PGP SIGNATURE----- From gbnewby at pglaf.org Sat Oct 21 23:49:08 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Sat Oct 21 23:49:10 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: <001d01c6f503$9a85e110$0300a8c0@blackbox> Message-ID: <20061022064908.GA6749@pglaf.org> On Sat, Oct 21, 2006 at 06:25:16AM -0500, Aaron Cannon wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > If I am not mistaken, Greg unbanned him because of a mandate, not by choice. > Or at least, that is what I recall he said on the matter when asked, and I > have no reason to doubt him. > > Sincerely > Aaron Cannon BB was never banned, he was simply moderated. When we moved to new mailing list software, I unmoderated him. At the time of moderation, I listened to community pressure, and followed through after making specific requests to BB for changes in behavior. However, ultimately individualized moderation is just not consistent with the overall way PG operates (see the "About" essays Michael and I worked on last year at www.gutenberg.org). Sidenote: Chuck Mattsen has handled moderation for posted and pgww for awhile, and needs to give up this duty. I could use a few volunteers to handle moderation of those lists, which have just a few subscribers but allow posting (after a moderation decision) by anyone. We also get a lot of spam to the other lists, including -d, glibrary and the newsletter lists, requiring moderation action. Basically this involves about 10 instances per day of spending a few seconds with the Mailman web-based interface. Email me if you monitor your email very regularly, and might be able to help with moderation. Back to topic: I encourage people to take control of their own mailboxes. If you don't like reading postings from someone, filter them. 
If you don't know how to filter people in the email program you use, ask here and we'll help. Many email programs can filter entire threads (by Subject line), but filtering an individual's email is even easier. I would prefer that Mailman (which otherwise is a capable and wonderful mailing list manager) offered subscribers the option to elect not to see messages from particular addresses, but that's not a currently available feature. Scott's recommendations, below, make a lot of sense to me. -- Greg > - -- > Skype: cannona > MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail > address.) > - ----- Original Message ----- > From: "Scott Lawton" > To: > Sent: Friday, October 20, 2006 11:11 PM > Subject: Re: [gutvol-d] Proposal: creation of new mailing list for > PG-related format and process debate > > > >>The problem is this: bowerbird is probably one of the most proficient > >>and accomplished trolls I've run across in 15+ years on the Internet. > > > >And, that Greg is too nice to ban him. Or create gutvol-bb and ban him > >from every other gut list. > > > >And, as you noted, that lots of people who should (IMHO) know better > >continue to reply to him. I periodically check the [folder name censored] > >where my filter dumps his posts; I've yet to encounter a reason to disable > >the filter. > >-- > > > >Cheers, > > > >Scott S. Lawton > >http://Classicosm.com/ - classic books From hyphen at hyphenologist.co.uk Sun Oct 22 01:40:38 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Sun Oct 22 01:40:52 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: <20061022064908.GA6749@pglaf.org> References: <001d01c6f503$9a85e110$0300a8c0@blackbox> <20061022064908.GA6749@pglaf.org> Message-ID: On Sat, 21 Oct 2006 23:49:08 -0700, Greg Newby wrote: |I encourage people to take control of their own mailboxes. If you don't |like reading postings from someone, filter them. 
If you don't know how |to filter people in the email program you use, ask here and we'll help. |Many email programs can filter entire threads (by Subject line), but |filtering an individual's email is even easier. Agreed |I would prefer that Mailman (which otherwise is a capable and wonderful |mailing list manager) offered subscribers the option to elect not to see |messages from particular addresses, but that's not a currently available |feature. Scott's recommendations, below, make a lot of sense to me. Agent 4 has introduced a fabulous Bayesian filtering system. IMO worth every penny I spent on it Just drag BBs posts to the junk folder and they will always end up there. Alternatively just set up a filter to delete BB's posts. -- Dave Fawthrop For Yorkshire Dialect http://www.gutenberg.org/author/John_Hartley http://www.gutenberg.org/author/F_W_Moorman 19,000 free e-books at Project Gutenberg! http://www.gutenberg.org From klofstrom at gmail.com Sun Oct 22 01:44:38 2006 From: klofstrom at gmail.com (Karen Lofstrom) Date: Sun Oct 22 01:44:41 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: References: <001d01c6f503$9a85e110$0300a8c0@blackbox> <20061022064908.GA6749@pglaf.org> Message-ID: <1e8e65080610220144w22f6dafakead2785217d0d12a@mail.gmail.com> Thanks for reminder about the filter. I switched to gmail a few months ago and had never used their filter option. First time for everything. Extremely easy to use. Bye-bye BB. -- Karen Lofstrom From desrod at gnu-designs.com Sun Oct 22 05:23:43 2006 From: desrod at gnu-designs.com (David A. 
Desrosiers) Date: Sun Oct 22 05:24:54 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: <20061022064908.GA6749@pglaf.org> References: <20061022064908.GA6749@pglaf.org> Message-ID: > I would prefer that Mailman (which otherwise is a capable and > wonderful mailing list manager) offered subscribers the option to > elect not to see messages from particular addresses, but that's not > a currently available feature. Wouldn't this ultimately expose email addresses to other users on the list? You'd have to have some way to select the user you wanted to filter/ignore by, and if that user never put in a "real name" when they subscribed to Mailman, the only other identifying information would be their email address. I suspect it wouldn't be a feature in Mailman any time soon more for ethical reasons than technical ones. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From davedoty at hotmail.com Sun Oct 22 09:33:49 2006 From: davedoty at hotmail.com (Dave Doty) Date: Sun Oct 22 09:33:51 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate Message-ID: > From: gbnewby@pglaf.org > I encourage people to take control of their own mailboxes. If you don't > like reading postings from someone, filter them. The problem is the high number of people who seem to enjoy arguing with BB. Even though I banned him years ago, it's still not uncommon that I open the mailbox and find it stuffed full of e-mail quoting him in full and following with extended arguing. The problem isn't even BB himself, but that this has become the BB Forum, and that debating him seems to take up more space than everything else PG-related. Other than banning half the list, most of whom have things worth saying in other contexts, how can I take control of my mailbox to deal with this? Or is it a case of "put up with it or leave?" 
_________________________________________________________________ Get the new Windows Live Messenger! http://get.live.com/messenger/overview From hyphen at hyphenologist.co.uk Sun Oct 22 10:34:54 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Sun Oct 22 10:35:05 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: References: Message-ID: On Sun, 22 Oct 2006 16:33:49 +0000, Dave Doty wrote: | |> From: gbnewby@pglaf.org | |> I encourage people to take control of their own mailboxes. If you don't |> like reading postings from someone, filter them. | |The problem is the high number of people who seem to enjoy arguing with BB. |Even though I banned him years ago, it's still not uncommon that I open the |mailbox and find it stuffed full of e-mail quoting him in full and following |with extended arguing. The problem isn't even BB himself, but that this has |become the BB Forum, and that debating him seems to take up more space than |everything else PG-related. Other than banning half the list, most of whom |have things worth saying in other contexts, how can I take control of my |mailbox to deal with this? Or is it a case of "put up with it or leave?" Agent 4 or was it 3? allows you to ignore Sub Thread, you can get rid of replies to BBs posts and anything downthread. -- Dave Fawthrop For Yorkshire Dialect http://www.gutenberg.org/author/John_Hartley http://www.gutenberg.org/author/F_W_Moorman 19,000 free e-books at Project Gutenberg! http://www.gutenberg.org From desrod at gnu-designs.com Sun Oct 22 10:55:32 2006 From: desrod at gnu-designs.com (David A. 
Desrosiers) Date: Sun Oct 22 10:57:53 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: References: Message-ID: <1161539732.10407.4.camel@localhost.localdomain> On Sun, 2006-10-22 at 16:33 +0000, Dave Doty wrote: > The problem isn't even BB himself, but that this has become the BB > Forum, and that debating him seems to take up more space than > everything else PG-related. I think you've hit the nail on the head. He's co-opted the list, so everyone has to either respond to him, or keep quiet. If you notice, he doesn't even let the smallest mention of his name slip, without a personal reply. When he's backed into a corner, he lays blame elsewhere by pointing fingers to someone else and goes on making splinter threads to keep everyone off-track, following his fake leads. He quite literally *CANNOT* stop replying or ignore someone who mentions his name or replies to a thread that he has ever responded to. -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/4bee0377/attachment.bin From cannona at fireantproductions.com Sun Oct 22 11:04:19 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sun Oct 22 11:06:11 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-relatedformat and process debate References: <20061022064908.GA6749@pglaf.org> Message-ID: <003b01c6f604$bedec390$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Sorry to have misstated the situation. I must have either misunderstood your message on the topic or simply remembered wrong. Either way, my bad. 
Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Greg Newby" To: "Project Gutenberg Volunteer Discussion" Sent: Sunday, October 22, 2006 1:49 AM Subject: Re: [gutvol-d] Proposal: creation of new mailing list for PG-relatedformat and process debate > On Sat, Oct 21, 2006 at 06:25:16AM -0500, Aaron Cannon wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> If I am not mistaken, Greg unbanned him because of a mandate, not by >> choice. >> Or at least, that is what I recall he said on the matter when asked, and >> I >> have no reason to doubt him. >> >> Sincerely >> Aaron Cannon > > BB was never banned, he was simply moderated. When we moved to new > mailing list software, I unmoderated him. At the time of moderation, I > listened to community pressure, and followed through after making > specific requests to BB for changes in behavior. However, ultimately > individualized moderation is just not consistent with the overall way PG > operates (see the "About" essays Michael and I worked on last year at > www.gutenberg.org). > > Sidenote: > Chuck Mattsen has handled moderation for posted and pgww for > awhile, and needs to give up this duty. I could use a few volunteers to > handle moderation of those lists, which have just a few subscribers but > allow posting (after a moderation decision) by anyone. We also get a > lot of spam to the other lists, including -d, glibrary and the > newsletter lists, requiring moderation action. Basically this involves > about 10 instances per day of spending a few seconds with the Mailman > web-based interface. Email me if you monitor your email very > regularly, and might be able to help with moderation. > > Back to topic: > I encourage people to take control of their own mailboxes. If you don't > like reading postings from someone, filter them. 
If you don't know how > to filter people in the email program you use, ask here and we'll help. > Many email programs can filter entire threads (by Subject line), but > filtering an individual's email is even easier. > > I would prefer that Mailman (which otherwise is a capable and wonderful > mailing list manager) offered subscribers the option to elect not to see > messages from particular addresses, but that's not a currently available > feature. Scott's recommendations, below, make a lot of sense to me. > -- Greg > >> - -- >> Skype: cannona >> MSN/Windows Messenger: cannona@hotmail.com (don't send email to the >> hotmail >> address.) >> - ----- Original Message ----- >> From: "Scott Lawton" >> To: >> Sent: Friday, October 20, 2006 11:11 PM >> Subject: Re: [gutvol-d] Proposal: creation of new mailing list for >> PG-related format and process debate >> >> >> >>The problem is this: bowerbird is probably one of the most proficient >> >>and accomplished trolls I've run across in 15+ years on the Internet. >> > >> >And, that Greg is too nice to ban him. Or create gutvol-bb and ban him >> >from every other gut list. >> > >> >And, as you noted, that lots of people who should (IMHO) know better >> >continue to reply to him. I periodically check the [folder name >> >censored] >> >where my filter dumps his posts; I've yet to encounter a reason to >> >disable >> >the filter. >> >-- >> > >> >Cheers, >> > >> >Scott S. Lawton >> >http://Classicosm.com/ - classic books > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. 
iD8DBQFFO7MOI7J99hVZuJcRAhQDAKCH/JGH2zLTDslARHOEIYEFhoqkAACfSnwU YoGM4Q72ariZqWK1c8Usx00= =LYIS -----END PGP SIGNATURE----- From Bowerbird at aol.com Sun Oct 22 12:12:22 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sun Oct 22 12:12:33 2006 Subject: =?ISO-8859-1?Q?re:=20[gutvol-d]=20Proposal:=20creation=20of=20ne?= =?ISO-8859-1?Q?w=20mailing=20list=20for=A0=20PG-related=20format=20and=20?= =?ISO-8859-1?Q?process=20debate?= Message-ID: <583.d050952.326d1c96@aol.com> and the ad hominem continues unabated, with the lynch mob having convinced itself that it speaks for the whole town. amusing. when was the last time there was an interesting thread here which i didn't bring into existence? do you think hundreds of lurkers are stupid -- can't tell who stays on-topic and who does not? i'm _voluntarily_ limiting myself to one post a day from now on out -- until/unless someone abuses my self-restraint -- so strain on our e-mailboxes downstream is clearly attributed to my detractors, who don't seem capable of talking about anything except me. what sad pathetic lives they must have if their attempts to bully me are the best part of it. it's ironic, because i _came_ here to talk to michael, but he doesn't even hang around here any more... at any rate, this is sunday's post. sorry for the detour. tomorrow i'll be back with some perl code that shows some things p.g. can accomplish with plain-text files. will my detractors add to this "open-source" effort? we'll see. but if i were you, i wouldn't hold my breath. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/52f77487/attachment.html From scott_bulkmail at productarchitect.com Sun Oct 22 15:45:41 2006 From: scott_bulkmail at productarchitect.com (Scott Lawton) Date: Sun Oct 22 15:56:10 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: References: Message-ID: > > I encourage people to take control of their own mailboxes. If you don't >> like reading postings from someone, filter them. > >The problem is the high number of people who seem to enjoy arguing with BB. Even though I banned him years ago, it's still not uncommon that I open the mailbox and find it stuffed full of e-mail quoting him in full and following with extended arguing. Many (though not all) of these replies include his name in the quoted portion, so those are also easy to filter. I do think that if "the usual suspects" filtered his posts and thus didn't reply, the noise level would go way down. That's an improvement over the current situation, but still has a potential drawback. Even if nearly everyone stopped taking the bait, bb could still undermine the list by continuing to post. It would be into a vacuum for those of us who filter, but not for everyone -- e.g. not for new folks on the list, and not for misc. people who come across the list archives. To someone who doesn't know the history, it looks downright rude that a whole bunch of posts have no replies. The general (though not universal) sentiment seems to be that bb is an unwelcome guest. Posting to the list is not a right, e.g. outright spam is certainly not allowed. So, I think the community would be better off by a ban. Anyone can go back thru the list archives to see why that step was taken (even if in the end some don't agree with it). And, as noted, there's no harm in creating a brand new list for bb. Anyone who values the discussion can go there. 
Though IMHO PG shouldn't feel at all obligated to do so; there's no shortage of places where bb can host his own list. -- Cheers, Scott S. Lawton http://Classicosm.com/ - classic books From arnold.villeneuve at cirilab.com Sun Oct 22 17:15:56 2006 From: arnold.villeneuve at cirilab.com (Arnold Villeneuve) Date: Sun Oct 22 17:23:21 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections Message-ID: <006401c6f638$6985dda0$6501a8c0@TRIAGE1> Skipped content of type multipart/alternative-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 11583 bytes Desc: not available Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/5cfdc3c8/attachment-0001.jpe From joshua at hutchinson.net Sun Oct 22 17:37:21 2006 From: joshua at hutchinson.net (joshua@hutchinson.net) Date: Sun Oct 22 17:37:34 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections Message-ID: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net> Well, two things. 1 - I have no idea what a knowledge map is and why it would be useful. Looking at the knowledge map for Mark Twain didn't explain what I was looking at (it seemed like a random collection of quotes with no discernible organization). A quick Google search gave a bunch of sites on Knowledge Maps but it still means nothing to me (all the sites I saw talked about organizing information, but not HOW it was organizing it). 2 - It is complete gibberish in Firefox. I only saw anything useful by using Internet Explorer. Since I don't use IE as my normal browser, I would have normally ignored any link that brought me to your site. I strongly suggest you fix that immediately. Josh ----Original Message----
From: arnold.villeneuve@cirilab.com
Date: Oct 22, 2006 20:15
To:
Subj: [gutvol-d] Knowledge Maps of Gutenberg Collections

Hello All

Cirilab Inc is a new company within The Gutenberg Project area. We are just beginning to see how our technology can leverage the vast Gutenberg warehouse of public domain books and writings.

What do we do? Cirilab creates Knowledge Maps of a collection of books / documents and Knowledge Views of individual books / documents.

What is our goal? Cirilab wants to create a Knowledge Map of the Top 100 Authors by download on Gutenberg as a first phase of our project.

What do we want? We really want to have input from Gutenberg Volunteers regarding our Knowledge Maps. We would really like the Gutenberg Volunteers to shape the development of Knowledge Maps of Authors works that are available on the Gutenberg website.

Here are a few examples of some of the work we have done with The Gutenberg Project so far. Please remember that these are just examples and that they are in early development. We produced them so that you would have something to evaluate.

http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm

http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doyle/index.htm

http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/index.htm

As per The Gutenberg Project, 20% of the profits generated from ads within the Knowledge Maps will be donated to the cause, which is part of our goal. We would be considered under the Partners, Affiliates, and Resources section of Gutenberg’s website. Eventually, we would like to get to a place where Gutenberg volunteers are satisfied with our Knowledge Maps so that they can be listed with each author we have done one for.

I look forward to hearing from you. We really want the most important people at The Gutenberg Project, Volunteers, to be driving the direction of this effort.

Arnold Villeneuve

Vice President

www.cirilab.com

http://knowledgeuser.typepad.com

613-833-0984


From donovan at abs.net Sun Oct 22 17:41:09 2006 From: donovan at abs.net (D Garcia) Date: Sun Oct 22 18:17:40 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections In-Reply-To: <006401c6f638$6985dda0$6501a8c0@TRIAGE1> References: <006401c6f638$6985dda0$6501a8c0@TRIAGE1> Message-ID: <200610222041.10067.donovan@abs.net> On Sunday 22 October 2006 08:15 pm, Arnold Villeneuve wrote: > Cirilab Inc is a new company within The Gutenberg Project area. We are just > beginning to see how our technology can leverage the vast Gutenberg > warehouse of public domain books and writings. Hey, now that's refreshing! Normally, when one is attempting to promote a commercial venture towards a potential partner, *working* examples are given. :) Now, for bonus points: Where is The Gutenberg Project Area in relation to Area 51? From scott_bulkmail at productarchitect.com Sun Oct 22 18:02:03 2006 From: scott_bulkmail at productarchitect.com (Scott Lawton) Date: Sun Oct 22 18:29:51 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections In-Reply-To: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net> References: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net> Message-ID: >2 - It is complete gibberish in Firefox. To Cirilab: you may also want to run the pages thru http://validator.w3.org/ -- Cheers, Scott S. Lawton http://Classicosm.com/ - classic books From cannona at fireantproductions.com Sun Oct 22 18:37:38 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sun Oct 22 18:37:59 2006 Subject: [gutvol-d] Fw: Gutenberg Republisher Update from Cirilab Message-ID: <002801c6f643$daa50f50$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This is a message that was sent to cd@pglaf.org from the same individual. I don't know if this sheds any more light on the subject or not. Anyway, I checked wikipedia and there is no article on knowledge maps, so I don't know either. 
Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Arnold Villeneuve" To: ; ; ; ; ; ; ; Sent: Saturday, October 14, 2006 1:53 PM Subject: RE: Gutenberg Republisher Update from Cirilab > Hello Again > > > > I forgot to mention that Cirilab Inc would like to have Gutenberg > participate in its Affiliate Program so that when people purchase our > software as a result of a link from the Gutenberg website or a Gutenberg > link within one of the Knowledge Maps created from its Public Domain > documents. For each purchase we would pay Gutenberg 20% of the purchase > price. > > > > Can someone please let me know how we would go about this. > > > > We would also like to include the Cirilab Gutenberg Library of Knowledge > Maps on the Gutenberg Affiliate page. > > > > Arnold Villeneuve > > Vice President > > www.cirilab.com > > http://knowledgeuser.typepad.com > > 613-833-0984 > > _____ > > From: Arnold Villeneuve [mailto:arnold.villeneuve@cirilab.com] > Sent: October 14, 2006 11:03 AM > To: 'arnold.villeneuve@cirilab.com'; 'help@pglaf.org'; 'errata@pglaf.org'; > 'catalog@pglaf.org'; 'copyright@pglaf.org'; 'cd@pglaf.org'; > 'hart@pobox.com'; 'gbnewby@pglaf.org' > Subject: Gutenberg Republisher Update from Cirilab > > > > Hello > > > > Here is an example of what we can do with Gutenberg Public Domain content. > This example is of the 13 works of Winston Churchill. > > > > http://www.cirilab.com/TSMap/Cirilab_Library/Literature/winston_churchill/in > dex.htm > > > > The Gutenberg logo and link is on every Knowledge Map page and > additionally > on every individual Knowledge View of a book. The entire document is the > original Gutenberg document. > > > > We have also created a link on Wikipedia to the Knowledge Map. > > > > > > > > > > I have two questions for the Gutenberg people at this time: > > > > 1. 
Is the republishing of the Gutenberg public domain documents within > the Knowledge Map acceptable to Gutenberg? > 2. How can Cirilab create and publish a Knowledge Map right on the > Gutenberg web site for the Top 100 authors to begin with? In other words, > when someone looks at a specific author's collection of works on > Gutenberg, > we would like to have a Knowledge Map link of their work on that page so > the > reader can navigate the collection thematically. > > > > Please let us know what you think of the Winston Churchill Knowledge Map. > > > > Arnold Villeneuve > > Vice President > > www.cirilab.com > > http://knowledgeuser.typepad.com > > 613-833-0984 > > _____ > > From: Arnold Villeneuve [mailto:arnold.villeneuve@cirilab.com] > Sent: October 7, 2006 9:55 PM > To: 'help@pglaf.org'; 'errata@pglaf.org'; 'catalog@pglaf.org'; > 'copyright@pglaf.org'; 'cd@pglaf.org'; 'hart@pobox.com'; > 'gbnewby@pglaf.org' > Subject: Gutenberg Republisher Request > > > > Hello > > > > I'm not really sure who I should be making this request to so I'm writing > to > all of you in the hopes that someone will be able to point me in the right > direction. > > > > Cirilab provides Information Triage technology that allows people to > review > great volumes of data more quickly and more precisely. We are now creating > a > library of Public Domain content and want to enter discussions with The > Gutenberg Project in order to properly access the Public Domain documents > you provide while ensure we adhere to your republication licensing > requirements. > > > > We are particularly interested in establishing a protocol whereby we can > create Cirilab Knowledge Maps of each author that The Gutenberg Project > has > available and do it in an automated way in order to make the process more > efficient and maintainable as The Gutenberg Project updates its content. > > > > Can the appropriate person at The Gutenberg Project please contact me > directly to begin these discussions. 
> > > > If we can make this work, Cirilab will ensure that a portion of any > revenue > we generate is donated to The Gutenberg Project as part of the partnership > arrangement. > > > > We look forward to hearing from you soon. > > > > Arnold Villeneuve > > Vice President > > www.cirilab.com > > http://knowledgeuser.typepad.com > > 613-833-0984 > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFFPBzwI7J99hVZuJcRAn66AJoCy778jNSPd4xIGeq74Ak3sALUSwCfZz/w K9mwTxZnMfd1XEJ8pX5FcLY= =Q4dr -----END PGP SIGNATURE----- -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 58456 bytes Desc: not available Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/bcba097e/image001-0001.jpg From arnold.villeneuve at cirilab.com Sun Oct 22 18:48:10 2006 From: arnold.villeneuve at cirilab.com (Arnold Villeneuve) Date: Sun Oct 22 18:48:58 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections In-Reply-To: Message-ID: <00b101c6f645$4be51b00$6501a8c0@TRIAGE1> Hello Onorio Thank you for taking the time to provide some feedback. I really appreciate it. We are trying to improve what we do and your comments are important to us. You are not the first person to request better Firefox or other web browser support. Your comments will help me raise the issue within my own company so that I can ensure that we do eventually provide better support for open source browsers. Again, thank you sincerely for your $0.02 cents. It's worth a lot to us. 
Arnold Villeneuve Vice President www.cirilab.com http://knowledgeuser.typepad.com 613-833-0984 _____ From: catenacci@gmail.com [mailto:catenacci@gmail.com] On Behalf Of Onorio Catenacci Sent: October 22, 2006 9:39 PM To: arnold.villeneuve@cirilab.com; Project Gutenberg Volunteer Discussion Subject: Re: [gutvol-d] Knowledge Maps of Gutenberg Collections On 10/22/06, Arnold Villeneuve wrote: Hello All Cirilab Inc is a new company within The Gutenberg Project area. We are just beginning to see how our technology can leverage the vast Gutenberg warehouse of public domain books and writings. What do we do? Cirilab creates Knowledge Maps of a collection of books / documents and Knowledge Views of individual books / documents. What is our goal? Cirilab wants to create a Knowledge Map of the Top 100 Authors by download on Gutenberg as a first phase of our project. What do we want? We really want to have input from Gutenberg Volunteers regarding our Knowledge Maps. We would really like the Gutenberg Volunteers to shape the development of Knowledge Maps of Authors works that are available on the Gutenberg website. Here are a few examples of some of the work we have done with The Gutenberg Project so far. Please remember that these are just examples and that they are in early development. We produced them so that you would have something to evaluate. http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doy le/index.htm http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/in dex.htm As per The Gutenberg Project, 20% of the profits generated from ads within the Knowledge Maps will be donated to the cause, which is part of our goal. We would be considered under the Partners, Affiliates, and Resources section of Gutenberg's website. 
Eventually, we would like to get to a place where Gutenberg volunteers are satisfied with our Knowledge Maps so that they can be listed with each author we have done one for. I know I shouldn't but I automatically tend to think less of webpages that are only viewable with IE. Especially considering the sort of person who's likely to volunteer to help with PG, this is a really glaring omission. It seems to me that a lot of PG's work is leveraged on open standards--which basically seems to be the antithesis of "Best Viewed With IE" webpages. Just my $.02. -- Onorio -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/e24241bc/attachment.html From Catenacci at Ieee.Org Sun Oct 22 18:50:36 2006 From: Catenacci at Ieee.Org (Onorio Catenacci) Date: Sun Oct 22 18:50:39 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections In-Reply-To: <00b101c6f645$4be51b00$6501a8c0@TRIAGE1> References: <00b101c6f645$4be51b00$6501a8c0@TRIAGE1> Message-ID: On 10/22/06, Arnold Villeneuve wrote: > > > > > Hello Onorio > > > > Thank you for taking the time to provide some feedback. I really appreciate > it. We are trying to improve what we do and your comments are important to > us. You are not the first person to request better Firefox or other web > browser support. > > > > Your comments will help me raise the issue within my own company so that I > can ensure that we do eventually provide better support for open source > browsers. > > > > Again, thank you sincerely for your $0.02 cents. It's worth a lot to us. > Don't think of it as supporting open source browsers. Think of it as supporting web standards. 
-- Onorio From Catenacci at Ieee.Org Sun Oct 22 18:38:32 2006 From: Catenacci at Ieee.Org (Onorio Catenacci) Date: Mon Oct 23 00:56:38 2006 Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections In-Reply-To: <006401c6f638$6985dda0$6501a8c0@TRIAGE1> References: <006401c6f638$6985dda0$6501a8c0@TRIAGE1> Message-ID: On 10/22/06, Arnold Villeneuve wrote: > > Hello All > > > > Cirilab Inc is a new company within The Gutenberg Project area. We are > just beginning to see how our technology can leverage the vast Gutenberg > warehouse of public domain books and writings. > > > > What do we do? Cirilab creates Knowledge Maps of a collection of books / > documents and Knowledge Views of individual books / documents. > > > > What is our goal? Cirilab wants to create a Knowledge Map of the Top 100 > Authors by download on Gutenberg as a first phase of our project. > > > > What do we want? We really want to have input from Gutenberg Volunteers > regarding our Knowledge Maps. We would really like the Gutenberg Volunteers > to shape the development of Knowledge Maps of Authors works that are > available on the Gutenberg website. > > > > Here are a few examples of some of the work we have done with The > Gutenberg Project so far. Please remember that these are just examples and > that they are in early development. We produced them so that you would have > something to evaluate. > > > > http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm > > > > > http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doyle/index.htm > > > > > http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/index.htm > > > > As per The Gutenberg Project, 20% of the profits generated from ads within > the Knowledge Maps will be donated to the cause, which is part of our goal. > We would be considered under the Partners, Affiliates, and Resources section > of Gutenberg's website. 
Eventually, we would like to get to a place where > Gutenberg volunteers are satisfied with our Knowledge Maps so that they can > be listed with each author we have done one for. > I know I shouldn't but I automatically tend to think less of webpages that are only viewable with IE. Especially considering the sort of person who's likely to volunteer to help with PG, this is a really glaring omission. It seems to me that a lot of PG's work is leveraged on open standards--which basically seems to be the antithesis of "Best Viewed With IE" webpages. Just my $.02. -- Onorio -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/34f96747/attachment.html From schultzk at uni-trier.de Mon Oct 23 01:02:30 2006 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Mon Oct 23 01:02:36 2006 Subject: [gutvol-d] here's the perl code for babelfish assignment 01 In-Reply-To: References: Message-ID: Hi There, Am 20.10.2006 um 18:50 schrieb Bowerbird@aol.com: > keith said: > > Do to its nature multi-line parsing or splitting is not quite > that easy > > maybe. but we'll make it work. :+) > > > > split is nice. But you want to be doing parsing > > on a personal note, any time i call what i'm doing "parsing", > it goes badly. but as soon as i call it by _another_ name, > the same thing with the same code, it starts working better. > so i've grown allergic to that word, and i almost never use > it. :+) > As you mentioned below. You want something quick and dirty. Which will get 80% of the way. Just like word for word translation. > however, i assume that you're talking about "parsing" in the > "let's parse the dom tree" sense. (no, i don't even know what > that means, so i might well have misused it, which would be > poetic in its own way.) > > that kind of "parsing" would make our code very complicated. 
> > and in the same way that i don't like my format to be complex, > i don't like my programs to be complex. so i make them simple. > > and what i'm showing people here is how much mileage can be > obtained out of the simple combination of a simple format and > some simple programs. that's the whole purpose of this exercise. > > so just stick with me for a little bit on "split", and see some > tricks. > > (and, just to be clear, although you might think this is related to > z.m.l., and thus can be swiftly relegated to the "i don't care" pile, > the truth of the matter is that since virtually all of the books in > the p.g. library have a plain-ascii representative, one that is close > to z.m.l. format and perhaps even exact, the code that i'm showing > here could also be used to great effect on the library as it stands. > there are a lot of neat features that could be offered with very > little > work or trouble by using the simple code routines i'll reveal here. > just as an example, how about a simple script that would give us > a list of the section-headers for every book in the entire library? > i don't know about you, but i think this "super table of contents" > for the entire library would be very cool, and likely quite useful. > and within a week or two of these little daily lessons, we'll have > it.) > > > > Just for the fun of it your script is incomplete. > > good observation, keith. now tell us why, and complete it... ;+) I am to lazy to write the code but you simply forgot to close a few html- tags no biggy. ;-)) As To Marcello suggestion about the file name one could write the case to check for the filename or make your code a procedure and write wrappers for a by case usage. Or.. Nahhh to complicated. reagrds Keith. P.S. Do I see it right where in Perl 101 why we try to top each other with the fastest and shortest code. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061023/12ae197b/attachment-0001.html From schultzk at uni-trier.de Mon Oct 23 01:14:48 2006 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Mon Oct 23 01:14:53 2006 Subject: [gutvol-d] here's the perl code for babelfish assignment 01 In-Reply-To: <4539088B.2070009@perathoner.de> References: <380.f7cf57f.326a50ff@aol.com> <4539088B.2070009@perathoner.de> Message-ID: <844C5014-F134-4787-9D67-B9ED5E980AE6@uni-trier.de> Hi There, Am 20.10.2006 um 19:34 schrieb Marcello Perathoner: > Bowerbird@aol.com wrote: > >> other times, when the script is sitting on your website and to be >> called, >> you'll want it to read the parameters as passed from the calling >> script. >> so, for instance, if we were to pass the filename starting in >> column 12: >>> read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'}); >>> $thefilename=substr($buffer,12); > > You are just cutting and pasting out of some perl cgi tutorial. You > don't have the least idea what is going on. > Do you know what you are doing ?? It always depends on how the code is called GET, POST or in the URL or you have stuff it in a cookie or even a browser variable. I guess we are back to Programming 101 and nit picking. Keith. From lee at novomail.net Mon Oct 23 09:45:58 2006 From: lee at novomail.net (Lee Passey) Date: Mon Oct 23 09:44:26 2006 Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate In-Reply-To: References: Message-ID: <453CF1C6.9040401@novomail.net> Dave Doty wrote: >> From: gbnewby@pglaf.org > >> I encourage people to take control of their own mailboxes. If you >> don't like reading postings from someone, filter them. > > The problem is the high number of people who seem to enjoy arguing > with BB. Even though I banned him years ago, it's still not uncommon > that I open the mailbox and find it stuffed full of e-mail quoting > him in full and following with extended arguing. 
The problem isn't > even BB himself, but that this has become the BB Forum, and that > debating him seems to take up more space than everything else > PG-related. Other than banning half the list, most of whom have > things worth saying in other contexts, how can I take control of my > mailbox to deal with this? Or is it a case of "put up with it or > leave?" By all objective measures, even at its busiest gutvol-d is a relatively low volume list. It is also (for me, at any rate) a relatively low priority list. Thus, what I have done is set a filter to automatically route /all/ traffic from gutvol-d into a gutvol-d folder. That way the posts do not disrupt my daily work-flow and I can choose the time to look at them; thus segregated, I can fairly quickly determine which messages deserve attention, and which can be consigned to the bit-bucket. -- Nothing of significance below this line. From sly at victoria.tc.ca Mon Oct 23 09:55:06 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Oct 23 09:55:12 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg Message-ID: It might be of interest to some here to take a look at the paper "Limits of self-organization: Peer production and laws of quality" by Paul Duguid. http://www.firstmonday.org/issues/issue11_10/duguid/index.html It contains some criticism of Project Gutenberg, particularly of PG#1079, Tristam Shandy. Andrew From prosfilaes at gmail.com Mon Oct 23 11:42:50 2006 From: prosfilaes at gmail.com (David Starner) Date: Mon Oct 23 11:42:55 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg In-Reply-To: References: Message-ID: <6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com> On 10/23/06, Andrew Sly wrote: > > It might be of interest to some here to take a look at the paper > "Limits of self-organization: Peer production and laws of quality" > by Paul Duguid. 
> > http://www.firstmonday.org/issues/issue11_10/duguid/index.html > > It contains some criticism of Project Gutenberg, particularly > of PG#1079, Tristam Shandy. It frustrates me that people keep nitpicking our translations. Yes, many translations aren't the greatest in the world. But translations stand independent of the original; how is someone supposed to really understand the note at the front of the Penguin edition without a copy of the earlier translation to compare it to? Oh, yeah, and apparently his computer can't read Latin-1 properly, and he blames us. From lee at novomail.net Mon Oct 23 14:32:13 2006 From: lee at novomail.net (Lee Passey) Date: Mon Oct 23 14:30:46 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg In-Reply-To: References: Message-ID: <453D34DD.2030703@novomail.net> Andrew Sly wrote: > It might be of interest to some here to take a look at the paper > "Limits of self-organization: Peer production and laws of quality" > by Paul Duguid. > > http://www.firstmonday.org/issues/issue11_10/duguid/index.html > > It contains some criticism of Project Gutenberg, particularly > of PG#1079, Tristam Shandy. > > Andrew Thanks for the highly interesting link. I have a few quibbles with the analysis, but it was very enlightening nonetheless. The biggest problem with the PG analysis, in my mind, is that while he identified some real and serious concerns, there was no suggestion of systemic changes which could be made to resolve those concerns. -- Nothing of significance below this line. 
From ian at babcockbrown.com Mon Oct 23 14:56:31 2006 From: ian at babcockbrown.com (Ian Stoba) Date: Mon Oct 23 15:45:58 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg In-Reply-To: <6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com> References: <6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com> Message-ID: <6F378C93-F5A8-4AB7-98F0-BFD295178B3A@babcockbrown.com> On Oct 23, 2006, at 11:42 AM, David Starner wrote: > On 10/23/06, Andrew Sly wrote: >> >> It might be of interest to some here to take a look at the paper >> "Limits of self-organization: Peer production and laws of quality" >> by Paul Duguid. >> >> http://www.firstmonday.org/issues/issue11_10/duguid/index.html >> >> It contains some criticism of Project Gutenberg, particularly >> of PG#1079, Tristam Shandy. > > It frustrates me that people keep nitpicking our translations. Yes, > many translations aren't the greatest in the world. But translations > stand independent of the original; how is someone supposed to really > understand the note at the front of the Penguin edition without a copy > of the earlier translation to compare it to? > > Oh, yeah, and apparently his computer can't read Latin-1 properly, and > he blames us. I thought the article was interesting and it raised two valid points, neither of which was really central to the paper's main question about the portability of the open source model to projects other than software development. First: It is very difficult to create an accurate e-book for a printed book in which typography and design are integral to the author's creation. This problem is not unique to PG, by any means, and Duguid is correct to point out that editorial decisions are made in the process of creating an e-book. Again, these are both artifacts of the conversion from printed page to binary bits and are true for all e-book efforts, not just PG. 
The part which does have the most direct bearing on PG is the fact that some books are extremely difficult to present accurately in ASCII text, and Tristam Shandy certainly falls into this category. I still found myself wondering: is there some system of organization that could have done a better job rendering this complex work in ASCII? I think the shortcomings of this e-book are much more due to the inherent difficulty of rendering the text than they are to anything involving the structure of the PG volunteer group. Second: PG ultimately aspires to being a repository of every public domain work on the planet. By definition that includes multiple editions of different works. The question of which edition gets digitized first depends on a number of factors. Duguid is correct in identifying that both newer editions (which may be encumbered with copyrights for introductions and other new content) and older editions (which may be too valuable or delicate to scan, or may simply be unavailable) may not be practical. This leaves a lot of Victorian era editions of many works available as source materials. In some cases, the editions available to PG to scan may have been Bowdlerized and may no longer reflect the author's original intent. The point is valid, but I don't see anything obvious that could be changed to make the situation better. The Million Book Project and Google both seem to face similar challenges in their efforts to scan public domain works. So ultimately, like everything involving humans, there are things in PG e-books that are imperfect and Duguid has pointed out two of them. Unfortunately, it does not seem to me that there are any practical structural or procedural changes that could be made that would address these issues. Perhaps high resolution page scans from a first edition are the best way to read an e-book of Tristam Shandy, but that is not a practical option for most readers. 
On balance, is the current (imperfect) version of the e-book better than not having a free e-book of Tristram Shandy at all? I think it is, but I would be interested to hear differing opinions. --Ian From sly at victoria.tc.ca Mon Oct 23 22:49:45 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Oct 23 22:49:49 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg In-Reply-To: <453D34DD.2030703@novomail.net> References: <453D34DD.2030703@novomail.net> Message-ID: Another point is that, as someone very involved with PG, I know that something with a PG number as low as 1079 is more likely to have certain inconsistencies or problems than a more recent release might have. However, to someone else, such as Paul Duguid, it is taken as being representative of the whole collection. To put it in perspective, this text is an example of part of a process (which is still ongoing) of volunteers discovering what works over time. (An issue in this case is putting footnotes inline, surrounded by brackets.) If we were to imagine an alternate reality where PG was a top-down organization, attempting to enforce sets of strict rules, it could easily be the case that this would have been put aside and not posted yet (nine years later). Is it possible, we could still be having arguments about the "proper" way to represent a blank, black page?
Re: Ian's comment about challenges of representing typography and design elements in digital transcriptions. Yes, as you say, this is a challenge for any group, not just PG. Some people have tried to preserve information relating to the digitization process. I've adapted dozens of texts for PG from other online sources, and I am no longer surprised to find examples where a text is very meticulously labelled with bibliographic data and so forth, (which makes it appear very scholarly and acceptable); only to examine it more closely and find out it is not actually from the source which it claims, or that the preparer has put much effort into documenting facts like smudged page numbers--while neglecting to fix many ocr scannos, etc. Andrew From radicks at bellsouth.net Tue Oct 24 08:02:58 2006 From: radicks at bellsouth.net (Dick Adicks) Date: Tue Oct 24 08:03:04 2006 Subject: [gutvol-d] Paper which mentions Project Gutenberg In-Reply-To: Message-ID: Andrew, it's worth noting that the critic adds the following qualification: "I do not want the arguments above to suggest that Gracenote is worthless or Project Gutenberg useless. Far from it. Both are immensely useful. Nonetheless, both suffer from problems of quality that are not addressed by what I have called the laws of quality -- the general faith that popular sites that are open to improvement iron out problems and continuously improve. In the case of Gracenote, it may be that only users with minority tastes suffer and they should be prepared to look after themselves. In the case of Project Gutenberg, by contrast, the Project does greatest disservice to those it most seeks to serve, the general reader who may not know enough about the texts he or she is reading to be able to distinguish nonsense from complexity, editorial misjudgment from authorial teasing, bowdlerization from Nordic prudery. In both cases, whether to guide users better or to improve the system, these limitations need to be recognized."
He acknowledges the "immense usefulness" of PG, but he calls for a more careful quality control. Haste makes waste. His criticism is worth heeding for a volunteer effort that works _sub specie aeternitatis_. Dick Adicks > From: Andrew Sly > Reply-To: Project Gutenberg Volunteer Discussion > Date: Mon, 23 Oct 2006 22:49:45 -0700 (PDT) > To: Project Gutenberg Volunteer Discussion > Subject: Re: [gutvol-d] Paper which mentions Project Gutenberg > > > Another point is that, as someone very involved with PG, I know that > something with a PG number as low as 1079 is more likely to have certain > inconsistencies or problems than a more recent release might have. > However, to someone else, such as Paul Duguid, it is taken as being > being representative of the whole collection. > > To put it in perspective, this text is an example of part of a process > (which is still ongoing) of volunteers discovering what works over time. > (An issue in this case putting footnotes inline, surrounded by brackets.) > If we were to imagine an alternate reality where PG was a top-down > organization, attempting to enforce sets of strict rules, it could > easily be the case that this would have been put aside and not posted > yet (nine years later). Is it possible, we could still be having arguments > about the "proper" way to represent a blank, black page? > > Re: Ian's comment about challenges of representing typography and design > elements in digital transcriptions. Yes, as you say, this is a challenge > for any group, not just PG. Some people have tried to preserve information > relating to the digitization process. 
I've adapted dozens of texts for PG > from other online sources, and I am no longer surprised to find examples > where a text is very meticulously labelled with bibliographic data and so > forth, (which makes it appear very scholarly and acceptable); only to > examine it more closely and find out it is not actually from the source > which it claims, or that the preparer has put much effort into documenting > facts like smudged page numbers--while neglecting to fix many ocr scannos, > etc. > > > > > Andrew > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From sly at victoria.tc.ca Tue Oct 24 18:01:18 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Oct 24 18:01:53 2006 Subject: [gutvol-d] Morse code Message-ID: Ok, here's something to file under "unanticipated uses of PG texts"... "A Princess of Mars" converted to Morse Code http://www.hotpeppersoftware.com/downloads/downloads.html Andrew From hyphen at hyphenologist.co.uk Tue Oct 24 19:27:38 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Tue Oct 24 19:27:54 2006 Subject: [gutvol-d] Morse code In-Reply-To: References: Message-ID: On Tue, 24 Oct 2006 18:01:18 -0700 (PDT), Andrew Sly wrote: | |Ok, here's something to file under "unanticipated |uses of PG texts"... | |"A Princess of Mars" converted to Morse Code |http://www.hotpeppersoftware.com/downloads/downloads.html If it does not conform to the W3 standard surely we can not use it ;-) -- Dave Fawthrop "Intelligent Design?" my knees say *not*. "Intelligent Design?" my back says *not*. More like "Incompetent design". 
Sig (C) Copyright Public Domain From Bowerbird at aol.com Fri Oct 27 12:01:09 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Oct 27 12:01:23 2006 Subject: [gutvol-d] the peace and quiet Message-ID: gosh, the peace and quiet here has been so pleasant that i've been totally reluctant to disturb it, even with just one post a day, even for our open-source project. so now i've stored up a credit backlog for several posts. maybe we'll start up again next monday. stay tuned... -bowerbird From hyphen at hyphenologist.co.uk Fri Oct 27 14:11:11 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Fri Oct 27 14:11:24 2006 Subject: [gutvol-d] the peace and quiet In-Reply-To: References: Message-ID: On Fri, 27 Oct 2006 15:01:09 EDT, Bowerbird@aol.com wrote: |gosh, the peace and quiet here has been so pleasant Well $?%$?%$&^%$ leave it that way ;-( -- Dave Fawthrop For Yorkshire Dialect http://www.gutenberg.org/author/John_Hartley http://www.gutenberg.org/author/F_W_Moorman 19,000 free e-books at Project Gutenberg! http://www.gutenberg.org From Bowerbird at aol.com Sat Oct 28 20:49:01 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Oct 28 20:49:09 2006 Subject: [gutvol-d] the peace and quiet Message-ID: <513.6734f940.32757ead@aol.com> dave said: > Well $?%$?%$&^%$ leave it that way please don't say "$?%$?%$&^%$" unless you really mean it... ;+) -bowerbird From Bowerbird at aol.com Mon Oct 30 13:57:50 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 30 13:58:00 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here Message-ID: hi. this is my one post for 2006-october-30th.
it's long, so please feel free to read it in chunks. (there is an "executive summary" at the bottom.) *** welcome back to "babelfish", our little "open-source" project here on gutvol-d. first, big props to jeroen hellingman for joining in with the "open-source" spirit. as he announced on the gutvol-p list, jeroen has created roughly _200_ t.e.i. e-texts that are now in the p.g. library, and he's made his various tools available. since his .tei isn't the same as the "official" .pgtei, he's put only his .html and .pdf in the p.g. library so far. but maybe one day, there will exist a conversion routine that morphs jeroen's .tei into the "official" one. in the meantime, check out his free tools! *** and now to respond to some feedback from the announcement i made about this project, including some code that i released (which is repeated below for your convenience)... keith said: > As you mentioned below. You want something quick and dirty. > Which will get 80% of the way. Just like word for word translation. this code is "proof-of-concept", so the goal is 100% functionality. and since the main thing i am demonstrating is that it will be _simple_ to write such code, i must also aim at "code simplicity". (but since i'm just a beginner with perl, that'll come naturally...) as long as i get reasonably close on those two -- proof-of-concept and simplicity of the code -- i'll happily settle for as low as 10% on other variables, like speed, size, elegance, code beauty, and so on... once a piece of code does what i want it to do, i'll move on. as "open-source" code, i have released it early, and expect that if other people like its functionality, they'll set out to improve it. i'm just a beginner in perl, anyway, so it's unlikely i would be able to smooth the code to professional level anyway, but nonetheless, given my clear objectives here, there's no reason for me to do it... 
my one and only mission now is to demonstrate the viability of "zen markup language" toward creating a high-powered library of plain-text files simple enough for a 4th-grader to maintain... i don't even care if project gutenberg implements these features, because i will be including them in my mirror of the p.g. library... i only wish to show here how easy it is to put 'em into play, since they can be realized with just a few lines of code written by a beginner... > As To Marcello suggestion about the file name > one could write the case to check for the filename > or make your code a procedure and write wrappers > for a by case usage. Or.. Nahhh to complicated. listening to marcello right now will only make you confused. stick with me for right now. i'll tell you everything you need. > I am to lazy to write the code but you simply > forgot to close a few html- tags. no biggy. ;-) some of that is intentional -- the "body" and "html" close tags -- because i might want to have my script append something else to that web-page in some of my experimenting down the line... any other ones -- like "pre" -- are just because i didn't care; the browser closes all the tags when it hits page-end anyway. > P.S. Do I see it right where in Perl 101 why we try to > top each other with the fastest and shortest code. again, that's not my game here, i'm doing proof-of-concept, but if you wanna play that, go ahead, it can be lots of fun... :+) i _am_ looking for _simple_ code, however. so if you show me an easier way of accomplishing some task, i might well adopt it. (except in cases where i'm gonna leverage my way down the line.) but as you will see, most of my routines are in the neighborhood of just a couple of lines anyway, so i don't think it gets more simple... *** keith said: > Do you know what you are doing? i know my code works for me. that's all i need to know right now.
> It always depends on how the code is called GET, POST or > in the URL or you have stuff it in a cookie or even a browser variable. if my code doesn't work for you, or you can see cases where it won't work under certain circumstances, do please let me know. but what i'm doing is simple enough that i doubt that will happen. (in a backchannel after having written this, i learned that keith was directing those remarks at marcello, not at me. but as i told him, i like to ask myself the "do you know what you're doing?" question on a regular basis. it's good for grounding and self-improvement.) *** when i came on this listserve 3 years ago, it was to tell people that heavy markup of the project gutenberg library is unnecessary. it is still unnecessary. for some people -- people who have made a significant investment of time and energy, (and life!), in trying to learn heavy markup -- this is _not_ a welcome message, and they would like to not have to hear it, sometimes to the point of keeping _you_ from hearing it. so they have tried to shout me down. and they have tried to get me banned. and they have tried to discredit me. at any time along the way, i could have provided enough irrefutable truth that they would have simply had to back off. but what fun would that have been? :+) so i decided to toy with them instead. much like a cat plays with a mouse. a mouse who thinks he's an elephant. on occasion, i would let them think they might have me "cornered", or "vulnerable". i wanted to see how brazen they would be. and boy could they be brazen! :+) at any rate, the time has come for proof. and the proof is in the pudding. eat it... play-time is over. we're getting real now. i expect that the name-calling will escalate, for a short time anyway. not long after that, they'll realize that their cause is hopeless, and give up. but _until_ then, they will do whatever they think they can get away with, to make you stop listening to me, so you won't see my proof. 
but it doesn't matter whether you see it or not, because this truth has a strength that will win... when michael hart insisted that project gutenberg was about _the_words_, and not fancy formatting, he was exercising a very insightful vision, because years later -- today -- we can make the formatting automatic, if we have the words in their right order. i will now prove that michael hart was right... *** as i said earlier, my point in organizing our open-source project is to demonstrate that a few simple programming techniques can give us remarkable power when used on a simple consistent file-format (like z.m.l.)... we can certainly create the .html that serves as a rosetta stone to various e-book formats. hence the name of this project: "babelfish"... to orient you, i created this "top 10 list" of these simple programming techniques... 1. read files, from disk or websites. 2. write files, to disk or webpage. 3. split and join strings. 4. search strings for substrings. 5. get portions of strings. 6. do replacements in strings. 7. loops (if/then, for/next, while/wend). 8. i forget what 8 was for. 9. collect and pass on user input. 10. zip/unzip online files. that's it. heck, i don't even know how to do #10 myself yet -- but i assume that it is easy using perl -- and i might occasionally throw in another technique. but for the most part, it'll just be these "top 10"... so if you can do these 10 simple things in _your_ choice of a programming language, then you too will be able to use the pseudo-code that i give you. i know that most of you are _not_ programmers, at all. but do please keep reading, because what's important here is _not_ the programming per se, but the features -- the functionality -- so your eyeballs will work fine... besides, these features are targeted directly at _users_, so each of you can judge their efficacy and desirability as well as anyone else. (i assume you are all readers.) 
i promise i won't dwell on the code, i'll just list it out, so people who wanna run it for themselves can do so. but from my standpoint, the output is what's important. so for each fragment of code, i'll give you a web u.r.l. to load in your browser so you see the results _right_away_. you don't have to look at the code at all, if you don't want. just load the u.r.l. underneath it, and look at its _output_... the lesson of my mission: a dirt-simple format and dirt-simple code can yield tremendous functionality. *** i'm not going to talk very much about the z.m.l. format in the development of this open-source coding project, other than to describe what we will need to know about it to write our routines. of course, once you observe how the routines work on a file, you'll obtain a good understanding for why the "rules of z.m.l." were made exactly as they were. here is the main .zml file that we'll use for this project: > http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml this is "my antonia", by willa cather. many people have digitized this book. (thanks to jon noring for the scans.) it would be good if you took a close look at this .zml file. convince yourself that there are no "tricks" in it, that it is a plain-and-simple raw-ascii file -- carefully done, certainly, but nothing more than that. this .zml file is our _input_... we'll have various types of _output_ along the way, but if you want to know the main goal we are shooting for, you should take a close look at the sequence of files you can hook into here: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp001.html this particular set of files is a demo for "continuous proofreading" -- a system where the public does the "final" proofing on a book -- so it puts up the text for each page opposite the scan for that page, along with a form at the bottom that lets people report corrections. of course, as you will see, we could also develop _other_ interfaces. 
indeed, one of the things you will take away from this demonstration is that it can be extremely easy to set things up exactly as you like it... after all, that's one of the promises of open-source, isn't it? *** for contrast, if you'd like to see e-book versions from jon noring: > http://www.openreader.org/myantonia/ or here's an _excellent_ .pdf, a "digital reprint" from jose menendez: > http://www.ibiblio.org/ebooks/Cather/index.html jose has replicated the look of the original book, and has links that enable the reader to summon the scans for comparison. awesome! *** so here we go... *** to recap, our assignment #1 was to (a) read an e-text in, then (b) write it out to a webpage. here's the perl code: > #!/usr/bin/perl > ####################### read the file in, and write it out > ####################### read the file in... > $filename="/home2/yoursiteinfohere/public_html/myant/myant.zml"; > open (inf,"$filename") or print "that file was not available...

\n"; > read (inf,$thebook,2222222); close inf; > ####################### ...and write it out... > print "content-type: text/html\n\n"; > print ''; > print ''; > print ''; > print ''; > print 'babelfish!'; > print ''; > print ''; > print ''; > print '

';
>    print '

'; > ####################### ...and here's the money-shot. > print $thebook; > # pseudo-code: read the file in, and write it back out again you can see the results of this code by running this script: > http://www.greatamericannovel.com/scgi-bin/babelfish01.pl this e-text is over 400k, so it takes a little while to load, especially if you're on dialup. so we'll try and do something about that later... technically, we've translated the e-text into .html. so we're done. (just kidding...) ;+) *** assignment #2 is to _split_ the e-text by its pages. if you examine the file, you'll see that pagebreaks are indicated by double-curly-brackets. the name of the scan of that page is enclosed in those curly-brackets. right above that line, the last line of each page is its pagenumber, enclosed in double-standard-brackets. in step #1 above, we read the book into a string. now we'll "chunk" that string into its respective pages, just by doing a "split" on a pair of open-curly-brackets. (the "split" command splits the big string into a bunch of little ones, by using the "separator" as a split point.) here's the code, which you can append to the code above after discarding the last line of code (i.e., "print $thebook;"): > ####################### chunk it into pages and output 3 > @onepage=split("{{",$thebook); foreach $onepage (@onepage) { > $nn++; if ($nn eq "136" or $nn eq "253" or $nn eq "364") > {print $onepage; print '

'}}; > # pseudo-code: chunk into pages; output 136, 253, and 364. you can see the results of this new code by running the script: > http://www.greatamericannovel.com/scgi-bin/babelfish02.pl this truly minuscule amount of code chunks the file into pages (splitting on double-open-curly-brackets that indicate pages), then runs through each page, and prints three selected ones... (what are those pages, and why did _those_ pages come out? let's see you answer those questions, and comment on them.) this ability to use "split" to separate the entire file into "chunks" -- while simple to understand and program -- will provide us a _lot_ of power for handling the text, when we use it wisely... in particular, this ability to split the file into its respective pages -- with each chunk being the text that appeared on one page -- is important because it's the very first step in creating the set of .html pages that i pointed to up above which serves as our "goal": > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp001.html *** so, did you answer the question about what pages we got, and why? we got pages 111, 222, and 333. those are the pages i _wanted_. but the code actually asked for _chunks_ that were 136, 253, and 364. so how come we got pages 111, 222, and 333? well, it's because this book has several "unnumbered" pages in it. these pages include 2 "cover" pages (the cover and an added "hot" table of contents) plus 13 other pages of front-matter, and some _illustration_ pages sprinkled throughout the book. (plus those illustrations are repeated at the end of the book.) the "unnumbered" pages have chunks of text associated with them (even if only the name of the graphic-file holding that illustration), so we actually have more "chunks" of text than _numbered_pages_. which means the two numbering systems are not in sync. 
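the mismatch between chunk numbers and page numbers can be made visible with a minimal sketch like the one below. it is only an illustration, not one of the babelfish scripts: the helper name page_map and the miniature sample "book" are made up, and it assumes each numbered page carries its "[[n]]" line somewhere inside its chunk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper (not part of the babelfish scripts): map each
# "[[n]]" pagenumber to the number of the chunk that contains it.
sub page_map {
    my ($book) = @_;
    # \Q..\E makes perl match the two braces literally
    my @chunks = split /\Q{{\E/, $book;
    my %chunk_for_page;
    my $nn = 0;
    foreach my $chunk (@chunks) {
        $nn++;    # chunks are counted from 1, as in babelfish02.pl
        # unnumbered pages (covers, illustrations) have no "[[n]]" line
        if ($chunk =~ /\[\[(\d+)\]\]/) {
            $chunk_for_page{$1} = $nn;
        }
    }
    return \%chunk_for_page;
}

# made-up miniature "book": a cover, an illustration, two numbered pages
my $sample = join "\n",
    "cover", "{{scan-a}}",
    "an unnumbered illustration", "{{scan-b}}",
    "first numbered page", "[[1]]", "{{scan-c}}",
    "second numbered page", "[[2]]", "{{scan-d}}", "";

my $map = page_map($sample);
foreach my $page (sort { $a <=> $b } keys %$map) {
    print "page $page is chunk $map->{$page}\n";
}
```

in this sample the two unnumbered chunks at the front push page 1 into chunk 3 and page 2 into chunk 4, which is the same kind of offset that separates page 111 from chunk 136 in the real file.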
i had to go through some trial-and-error to discover the "chunk" numbers that i needed, in order to get pages 111, 222, and 333, as the chunks that i needed to request were 136, 253, and 364... but of course, we don't want to have to do such "trial-and-error" every time we want to display a specific page-number of text, so what we will do is _search_ the text of each page/chunk for the _pagenumber_ that we want. you will remember the pagenumber is enclosed in double-brackets as the last line of each page's text, so it's rather easy to do a search for it. when we find the chunks that contain "[[111]]" and "[[222]]" and "[[333]]", we will spit _those_ out. so here we go with assignment #3: output pages 111, 222, and 333. instead of the 3 lines that we had in the routine for babelfish02.pl, we'll use this code instead for this new assignment, babelfish03.pl. > ####################### output pages 111, 222, and 333 > @thepage=split("{{",$thebook); foreach $thepage (@thepage) { > $r1=index($thepage,"[[111]]"); $r2=index($thepage,"[[222]]"); > $r3=index($thepage,"[[333]]"); if ($r1 > -1 or $r2 > -1 or $r3 > -1) > {print $thepage; print '

'}};
> # pseudo-code: output pages with "[[111]]", "[[222]]", or "[[333]]"

you can see the results of this code by running the next script:
> http://www.greatamericannovel.com/scgi-bin/babelfish03.pl

of course, the output here looks just like it did for babelfish02.pl; but we've got a more robust way of creating it now, which is good.

***

so, to display the page we wanted, we learned how to search the text. this naturally introduces us to the idea of searching for _words_, and displaying the pages that contain the terms we'd specified...

so the next assignment is to write a _word-search_ routine; let's say the user had specified a search for the term "breeze".

assignment #4: output all of the pages with the word "breeze". it's just a simple cut-back of the 5-line routine we just wrote...

> ####################### output pages containing "breeze"
> @thepage=split("{{",$thebook); foreach $thepage (@thepage)
> {$result=index($thepage,"breeze");
> if ($result > -1) {print $thepage; print '

'}};
> # pseudo-code: output all pages that contain the term "breeze".

you can see the results of this code by running the new script:
> http://www.greatamericannovel.com/scgi-bin/babelfish04.pl

wow, that _was_ a breeze, wasn't it? a lot of power in those 3 lines.

so you're beginning to get the picture. with two dozen lines of code, copied out of a _primer_ on perl, we've managed to cobble together the raw engine to do an electronic search (one of the most powerful of all the benefits offered by a switch from paper-books to e-books), and to display one page, so we don't have to load in the whole book.

***

this ability to "split" the string can operate on a very granular level. we can split the string on _whitespace_ -- spaces, newlines, tabs -- such that every _word_ is treated distinctly. this means that we can uniquely identify, by number, each and every word in the entire file. so our next assignment is to do something along those very lines...

assignment #5: number each word and output #75319-#75357.

> ####################### chunk words, output 75319-75357
> print "";
> @oneword=split(" ",$thebook); foreach $oneword (@oneword) {$nn++;
> if ($nn >= "75319" and $nn <= "75357") {print "$nn -- $oneword\n"}};
> # pseudo-code: chunk into words; output 75319-75357.

you can see the results of this code by running the new script:
> http://www.greatamericannovel.com/scgi-bin/babelfish05.pl

as you can see, this split and run through the words happens fast, especially when we're outputting a mere 39 words. if we output all of the words, it's kinda slow, so we'd need to speed it up if we wanted to put this in front of the public. but gosh, what power!

not that people want to read a book with just one word per line, but our ability to be _specific_ in pinning down a certain word -- i.e., the 75,319th word in this file is "sunflower" -- could be quite useful if we ever need to do any integrity checks on the file.
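for anyone who wants to try the word-numbering idea without the full e-text, here is a minimal sketch under stated assumptions: numbered_words is a hypothetical helper (it is not in the babelfish scripts), and the sample string is just a short stand-in for the book. it relies on perl's special-case split on a single space, which splits on any run of whitespace and ignores leading whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: return [number, word] pairs for words $from..$to,
# counting words from 1 the way the babelfish05.pl loop does.
sub numbered_words {
    my ($text, $from, $to) = @_;
    # split ' ' splits on runs of spaces, tabs, and newlines
    my @words = split ' ', $text;
    my @picked;
    foreach my $nn ($from .. $to) {
        next if $nn < 1;          # ignore numbers before the first word
        last if $nn > @words;     # stop at the end of the text
        push @picked, [ $nn, $words[ $nn - 1 ] ];
    }
    return @picked;
}

# stand-in text with mixed whitespace and an end-of-line hyphen
my $sample = "  the very clods and fur-\nrows in the fields\tseemed to stand up sharply.";

foreach my $pair (numbered_words($sample, 4, 7)) {
    print "$pair->[0] -- $pair->[1]\n";
}
```

note that a word hyphenated across a line break comes out as two "words" here ("fur-" and "rows"), because the split only looks at whitespace.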
the ability to point to this particular sequence of words: > 75319 -- sunflower > 75320 -- stalk > 75321 -- and > 75322 -- clump > 75323 -- of > 75324 -- snow-on-the-mountain, > 75325 -- drew > 75326 -- itself > 75327 -- up > 75328 -- high > 75329 -- and > 75330 -- pointed; > 75331 -- the > 75332 -- very > 75333 -- clods > 75334 -- and > 75335 -- fur- > 75336 -- rows > 75337 -- in > 75338 -- the > 75339 -- fields > 75340 -- seemed > 75341 -- to > 75342 -- stand > 75343 -- up > 75344 -- sharply. > 75345 -- I > 75346 -- felt > 75347 -- the > 75348 -- old > 75349 -- pull > 75350 -- of > 75351 -- the > 75352 -- earth, > 75353 -- the > 75354 -- solemn > 75355 -- magic > 75356 -- that > 75357 -- comes with such a large degree of specificity is quite fantastic, and might come in quite handy when we start building our linkage capabilities. we might have dissent about how "word" is defined -- what with > 75324 -- snow-on-the-mountain, or > 75335 -- fur- > 75336 -- rows -- but given that anyone in the world will get this same sequence when running this same perl program on this same file, i'd think we can accept this output as is and still feel quite comfortable... *** a split that is even more useful is the one we can do on _lines_. so our next assignment, just a quick rewrite, is to do exactly that... assignment #6: number each _line_ and output _all_of_them_ > ####################### number the lines, and output all > @oneline=split("\n",$thebook); foreach $oneline (@oneline) { > $nn++; if ($nn >= "0" and $nn <= "99999") {print $oneline}}; > # pseudo-code: chunk into lines; number and output all of them you can see the results of this code by running the new script: > http://www.greatamericannovel.com/scgi-bin/babelfish06.pl an excellent example of the power of a very small amount of code, it's extremely likely that we'll return to this capability down the line. 
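a minimal sketch of the line split, this time with the line numbers actually printed, might look like the following. numbered_lines is a hypothetical helper and the sample text is made up; it also assumes the text already uses line-feeds as its line endings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: split a text into lines and prefix each with its number.
sub numbered_lines {
    my ($text) = @_;
    # by default split drops trailing empty fields, so blank lines at the
    # very end of the text are not numbered; blank lines in the middle are
    my @lines = split /\n/, $text;
    my @out;
    my $nn = 0;
    foreach my $line (@lines) {
        $nn++;
        push @out, "$nn: $line";
    }
    return @out;
}

# made-up sample with a blank line in the middle
my $sample = "first line\n\nthird line\n";
print "$_\n" foreach numbered_lines($sample);
```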
indeed, i can guarantee we'll be using this routine more in the future, so we'll leave any further exposition of its magic for later...

***

so far most of our "splits" have been on the p-book _pages_, but we can split on other stuff if we _want_, and we just might.

one of the rules of .zml is that a new section is indicated by the presence of 4 or more blank lines before its header. so we might want to search for _headers_ by searching for (at least) 4 blank lines (i.e., 5 or more newline characters)...

and yes, we _could_ just proceed as we did above for "breeze", and first split the book up into pages, and then search the text of each page for 5 consecutive newlines. sure, that would work.

but we can also split the book -- in the first place -- by using 5 consecutive newlines as the splitter. what _that_ would do is split the book up into its _sections_, rather than its pages... (of course, in most p.g. e-texts, the "sections" are "chapters", so you will all understand if i use those terms interchangeably.)

and since that will be a useful thing later, let's learn it now.

you might remember i sought help on chunking a string by doing a split on _multiple-consecutive-line-endings_. i said:

> and if someone would tell me how to do a "split" on
> a sequence of multiple line-endings, that'd be great.
> i assumed it would be something like this:
> > @thesections=split('\n\n\n\n\n',$thebook);
> but that doesn't appear to be working for me.

marcello came in with an answer. i guess he couldn't bring himself to tell you that the command i gave there actually _is_ a correct specification for doing such a split. he had to make a little mod that made it _seem_ different. but the code i wrote there will indeed do the job just fine...

the _reason_why_ "it doesn't appear to be working" is that that code will split on the _line-feed_ character that linux (and thus webservers) define as the "newline" character...
but in actuality, the _input-file_ -- the "myant.zml" file -- had _carriage-returns_ as its linebreak characters, since i made it on a mac, and the mac uses a _carriage-return_ as its indicator of a "newline".

welcome to the world of cross-platform incompatibilities.

you might know windows has yet another convention; it uses a combination of a carriage-return-and-line-feed as _its_ "newline" indicator. because one incompatibility is never enough, is it?

now, since your web-browser might well do you the favor of translating the carriage-returns in that myant.zml file into the appropriate newline character on _your_ machine when it displays that file to you, you might not have realized that that file itself has the "wrong" newline characters in it...

but our little perl program takes each character _literally_...

so when my "split" command -- and marcello's as well -- went looking for 5 consecutive line-feeds, it found _zero_ occurrences, so it didn't actually do the split we wanted... if you go looking for consecutive-line-feeds in a file that uses carriage-returns for its newlines, you won't find any line-feeds, not a one, let alone 5 consecutive ones.

that is, the command didn't give us the _output_ we wanted, because the _input_ file wasn't exactly like we thought it was.

thus, when i asked "what's wrong with this code?", i was asking the wrong question. there was _nothing_ wrong with the code; the problem was the _assumption_ about the file's line-endings.

marcello's workaround for this problem was a good one: do a global-replace on the file that changes _non-desired_ line-ending characters into the line-endings that we want. he did that with this line:

> $text =~ s/\r//g; # fix brain-dead M$-DOS and Mac line endings

maybe he didn't realize the line-ending problem right away, since his "brain-dead" comment _might_ indicate frustration...
after all, no one newline is "right" or "wrong", any more than driving on the "right" side of the road is the "right" way to do it. over in england, they drive on the left side of the road; that isn't "brain-dead", it's just a _different_ way of doing things, that's all.

but this line-ending confusion _is_ a constant hassle, i'll tell you. (and since the mac is in the minority, i have to face it all the time. so i hope you'll excuse me for reversing the problem on all of you.)

so, in general, if you'll be dealing with files that you haven't created, and thus don't know what line-endings they use, doing a conversion -- right after you've read any file in -- is the _best_ way to proceed...

you _can_ also write your code so it'll be oblivious to line-endings, but sometimes that can get unnecessarily complex and unwieldy...

another option -- _if_ you're working with files that you create and maintain -- is to standardize the line-ending used in the files. (nonetheless, as a matter of course, the conversion doesn't hurt; at worst, it will do nothing, but you will have the peace of mind.)

still another option is to maintain separate versions of each file, one for each line-ending, so you can call in the appropriate one.

while i don't typically recommend it (why maintain separate files?) this last option is the one that i'll use for the rest of this exercise. my code will load a file named "myant-lf.zml" (_not_ "myant.zml"). (indeed, i snuck it in on the previous exercise, for babelfish06.pl.)

however, when i _talk_ about the file, _i_ will still call it "myant.zml". i'm not doing this _deliberately_ to confuse you (ok, maybe a little); the purpose is to serve as a constant reminder of this little irritation so you don't let yourself -- or your routines -- get tripped up by it.
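the "convert right after reading the file in" advice can be sketched as below. this is a minimal illustration under stated assumptions, not code from the babelfish scripts or from marcello's post: normalize_newlines and split_sections are hypothetical names, the substitution turns carriage-returns into line-feeds (rather than deleting them, which would leave a mac-style file with no linebreaks at all), and the split uses "5 or more" line-feeds instead of exactly 5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: turn windows (CR+LF) and old-mac (CR) line endings
# into plain line-feeds; text that already uses line-feeds is unchanged.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

# hypothetical helper: normalize first, then split into sections on
# 5 or more consecutive line-feeds (i.e., 4 or more blank lines).
sub split_sections {
    my ($text) = @_;
    return split /\n{5,}/, normalize_newlines($text);
}

# made-up sample that uses carriage-returns, like a file made on a mac
my $mac = "intro\r\r\r\r\rCHAPTER I\rtext\r\r\r\r\rCHAPTER II\rmore";

my @raw      = split /\n{5,}/, $mac;      # no line-feeds to find here
my @sections = split_sections($mac);

print "without conversion: ", scalar(@raw), " chunk\n";
print "with conversion: ", scalar(@sections), " chunks\n";
```

the first split finds nothing to split on and hands back the whole text as one chunk, which is the "it doesn't appear to be working" symptom; after the conversion, the same pattern finds all three sections.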
it's also a general warning that program routines are quite literal; if you _tell_ them to look for a line-feed, they'll look for a line-feed, even if what you _really_ wanted was for them to look for a newline (whether that be a carriage-return or a line-feed or a combination).

the lesson is that it's very important for you to be very precise when telling routines what you want them to do. they'll do what you _say_, provided you say it correctly, and not necessarily what you _mean_...

now hey, maybe you non-programmers are thinking to yourselves, "didn't he tell us that he wouldn't bog us down in program-speak?" yes, i did, and i'm sorry if this has strayed a little too close to the line, but the lesson applies to you guys too, not just to us programmers...

when you tell us the features that you want, you need to be specific! if you can specify -- in terms of the actual reality of the file as is -- how to obtain a feature that you want, that will almost _guarantee_ that you will get that feature.

***

ok, so our next assignment is to split the text into sections.

so here we go with assignment #7: output the 21st section.

> ####################### output the 21st section
> @onesection=split("\n\n\n\n\n",$thebook); foreach $onesection (@onesection) {
> $nn++; if ($nn == 21) {print $onesection}};
> # pseudo-code: split into sections and output the 21st

you can see the results of this code by running the new script:
> http://www.greatamericannovel.com/scgi-bin/babelfish07.pl

as you can imagine, later on we'll have a need to output sections, so this is an extremely important piece of code. in just a few lines.

***

splitting into sections can also help us formulate a "table of contents". the section's title is the first line, so we'll just skim it for each section.

assignment #8: let's skim the header off each section, and list them...
> ####################### output section-header lines
> $thebook=substr($thebook,40);
> @onesection=split("\n\n\n\n\n",$thebook); foreach $onesection (@onesection) {
> $tit=substr($onesection,0,200); $tit=~s/^\s+//; $ccc++; if ($ccc <= "9") {print "o"};
> @oneline=split("\n",$tit); $nnn=0; foreach $oneline (@oneline) {
> $nnn++; if ($nnn eq "1" ) {print "$ccc -- $oneline\n"}}}
> # pseudo-code: skim the header of each section and output it

you can see the results of this code by running the new script:
> http://www.greatamericannovel.com/scgi-bin/babelfish08.pl

ok, that's nice enough -- indeed, it is _tantalizingly_ close -- and that means you can probably guess what we'll want next.

***

assignment #9: let's make that "table of contents" into hotlinks...

> ####################### output hotlinked table-of-contents
> print ""; @thechap=split("\n\n\n\n\n",$thebook);
> $pp=1; $past="myantc001"; foreach $thechap (@thechap) {
> $nn++; if ($nn ne "0" and $nn ne "1") {if ($pp<10) {print "o"};
> print $pp; $pp++; print " -- "; $printme=substr($thechap,0,200);
> if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
> if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
> if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
> if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
> if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
> $ttt=0; @thetitle=split('\n',$printme); foreach $thetitle (@thetitle) {
> $ttt++; if ($ttt eq 1) {
> print ''; print substr($thetitle,1); print '';
> print "\n"; $past=substr($thechap,length($thechap)-55,500);
> for ($i=0;$i<=99;$i++) {
> if (substr($past,0,5) ne "myant") {$past=substr($past,1)};
> if (substr($past,0,5) eq "myant") {$i=99};
> } $past=substr($past,0,9);}}}}
> # pseudo-code: split into sections, strip headers, and output

you can see the results of this code by running the new script:
> http://www.greatamericannovel.com/scgi-bin/babelfish09.pl

wow.
now we're talking. a completely _hotlinked_ table-of-contents for this book, executed in just a dozen-and-a-half lines of hack-something-together code. we can see this kind of thing being _useful_. and we've only just started. wow. *** so let's start to wrap it up for today, ok? before i go, i'd like to point you to one file in the "my antonia" set: > http://www.greatamericannovel.com/myant/myantp123.html this file "validates", which i'm sure will make marcello very happy. indeed, i even put in a link -- at the very _bottom_ of the page -- that submits the page to the validator to save him that little chore. just click on "make my day", marcello, for your precious validation. not only that, but i took out the "font" tag to make david happy too, since he was concerned, as that tag is now "deprecated". (oh no!) of course, since i'm now using a header tag -- "h6", to be exact -- for the pagenumber, david might start whining about "tag abuse". just goes to show how hard it is to make obsessives happy. :+) so let me put out this general call: if anyone here wants to make a template for our little open-source project here, please feel free! make it as totally-standards-compliant as your little heart desires! don't leave it up to me, folks. as long as something _works_, i'll be happy with it, no matter what the standards mafia says. so if you want to save the world from non-standardized markup, you'll need to step up with a solution. or i'm deaf to your whining. in terms of what i'd _like_, though, here's what i can tell you... first, i'd like to be able to lay in a background -- like this one -- > http://snowy.arsc.alaska.edu/bowerbird/misc/goodbook.jpg that will _resize_itself_automatically_ to the window's exact size. i don't know how to do this using .css, even though the need for such a capability would appear to be totally obvious. (it is to me.) next, i'd like to have a 2-column layout where i can flow text in each column separately. 
(this i can pretty much do already, although back when i was working on it, i seem to remember there were some little niceties i wanted to introduce into it, and now i can't remember what they were.) also nice would be if you can have the font-size increase until it fills its column, so it grows as big as it can possibly be, while all of it remains in the window (that is, so the end-user doesn't have to scroll down to see it). one more thing; if you know a way to force-justify, it'd be nice. i can do all of these things in my offline apps, and i'd like to have the server-side version look just as nice as those apps. thanks for your contribution to our open-source gutvol-d project! *** ok, here's an "executive summary" of the exercises we did today; the first line tells us the assignment, the second gives us the u.r.l. > ####################### output file after reading it in > http://www.greatamericannovel.com/scgi-bin/babelfish01.pl > ####################### output 3 predetermined pages > http://www.greatamericannovel.com/scgi-bin/babelfish02.pl > ####################### output pages 111, 222, 333 > http://www.greatamericannovel.com/scgi-bin/babelfish03.pl > ####################### output pages with "breeze" > http://www.greatamericannovel.com/scgi-bin/babelfish04.pl > ####################### output words #75319-75357 > http://www.greatamericannovel.com/scgi-bin/babelfish05.pl > ####################### output all lines, numbered > http://www.greatamericannovel.com/scgi-bin/babelfish06.pl > ####################### output section number 22 > http://www.greatamericannovel.com/scgi-bin/babelfish07.pl > ####################### output header of each section > http://www.greatamericannovel.com/scgi-bin/babelfish08.pl > ####################### output hotlinked table-of-contents > http://www.greatamericannovel.com/scgi-bin/babelfish09.pl at this time, it's probably worth a step back to look at the big picture. 
the ability to take an e-text file and spit out its pages of text, and to do searches on it, and to provide a hotlinked table-of-contents page, is well-and-good for _one_ book. but that is not all we have here... no, what we have here is a tool that can enable us to do these things for _20,000_ e-texts. and we made it in a matter of _mere_minutes_ (ok, many hours for me to write it up, but you can copy it in seconds), with fewer than 100 lines of code, copied out of a primer on perl... think about that. and we've only just begun. so ok, now who wants to port this code into python, or php, or ruby? c'mon, don't be shy, that's what open-source projects are all about... anyway, that's a good day's work. there'll be more in coming days... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061030/11641e48/attachment-0001.html From desrod at gnu-designs.com Mon Oct 30 14:43:32 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Mon Oct 30 14:44:31 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here In-Reply-To: References: Message-ID: <1162248212.5857.1.camel@localhost.localdomain> On Mon, 2006-10-30 at 16:57 -0500, Bowerbird@aol.com wrote: > no, what we have here is a tool that can enable us to do these things > for _20,000_ e-texts. and we made it in a matter of _mere_minutes_ > (ok, many hours for me to write it up, but you can copy it in > seconds), with fewer than 100 lines of code, copied out of a primer on > perl... Its obvious from reading the snippets, that it is indeed copied out of a rudimentary Perl primer, and not touched by anyone who has a strong grasp of the power of the language at hand. Exactly what is it you are trying to prove with this anyway? We know how to write parsers that can chew up and spit out a Gutenberg etext into other formats, I don't think that's the core of the problem here. -- David A. 
Desrosiers desrod@gnu-designs.com http://gnu-designs.com -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061030/3a5a7431/attachment.bin From Bowerbird at aol.com Mon Oct 30 17:07:43 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Oct 30 17:07:49 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here Message-ID: david said: > Exactly what is it you are trying to prove with this anyway? sorry, the time for the endless listserve merry-go-round is done. it was a fun run, wish you woulda been here for the whole thing. but it's pudding time now. if you want to put your questions on a publicly-editable wiki somewhere, where we can refer future questioners, so these topics don't get raised over and over as a mere _stalling_ tactic, then i'll be happy to answer them, there. but right here, right now, it's straight-ahead only. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061030/50356ec0/attachment.html From marcello at perathoner.de Tue Oct 31 08:52:13 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Oct 31 08:52:18 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here In-Reply-To: <1162248212.5857.1.camel@localhost.localdomain> References: <1162248212.5857.1.camel@localhost.localdomain> Message-ID: <45477F3D.70205@perathoner.de> David A. Desrosiers wrote: > Its obvious from reading the snippets, that it is indeed copied out of a > rudimentary Perl primer, and not touched by anyone who has a strong > grasp of the power of the language at hand. He's a baby that makes poo in the chamberpot for the first time and thinks his parents are watching him because they want poo. 
> Exactly what is it you are trying to prove with this anyway? We know how > to write parsers that can chew up and spit out a Gutenberg etext into > other formats, I don't think that's the core of the problem here. He's just inventing warm water (and trying to get credit for it). This parser is online. It converts any PG text into a plucker database. And it is open source and written in gasp! python. We have served 130,000 plucker texts in October this way. The only guy who hasn't noticed yet is him who notices everything. There are a few other PG parsers around like GutenMark and my PG to TEI converter. All of them are open source and working today. So its only natural that you-know-who will hold his non-working at-the-rate-its-going-never-to-be-released zml parser against them, just for the fun of causing confusion. Ever wondered who pays him to fuzz and fudge? -- Marcello Perathoner webmaster@gutenberg.org From desrod at gnu-designs.com Tue Oct 31 09:28:49 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Tue Oct 31 09:29:35 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here In-Reply-To: <45477F3D.70205@perathoner.de> References: <1162248212.5857.1.camel@localhost.localdomain> <45477F3D.70205@perathoner.de> Message-ID: <1162315729.5921.0.camel@localhost.localdomain> On Tue, 2006-10-31 at 17:52 +0100, Marcello Perathoner wrote: > This parser is online. It converts any PG text into a plucker > database. And it is open source and written in gasp! python. We have > served 130,000 plucker texts in October this way. Plucker? I've heard of that application... -- David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061031/c7def110/attachment.bin From marcello at perathoner.de Tue Oct 31 11:34:46 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Oct 31 11:34:49 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here In-Reply-To: References: Message-ID: <4547A556.5000001@perathoner.de> Bowerbird@aol.com wrote: > hi. this is my one post for 2006-october-30rd. This is just about programming. Why don't you post this to the appropriate list? mailto: gutvol-p@lists.pglaf.org -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Tue Oct 31 12:48:58 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Oct 31 12:49:11 2006 Subject: [gutvol-d] gvd061030 -- let's get it started in here Message-ID: i told you there'd be flack... :+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061031/682d3da5/attachment.html From Bowerbird at aol.com Tue Oct 31 12:51:54 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Oct 31 12:52:04 2006 Subject: [gutvol-d] gvd061031 -- any more reaction to duguid? Message-ID: so is there any more reaction to the duguid article? i wanna make sure everyone has had a chance to say what they think before i tell you what i think... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061031/0b3b06a6/attachment.html