From jon.ingram at gmail.com  Sun Oct  1 03:40:51 2006
From: jon.ingram at gmail.com (Jon Ingram)
Date: Sun Oct  1 03:41:03 2006
Subject: [gutvol-d] PG Examples of XHTML and CSS?
In-Reply-To: <000001c6e507$f9bb6ee0$1f12fea9@sarek>
References: <000001c6e507$f9bb6ee0$1f12fea9@sarek>
Message-ID: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>

On 10/1/06, John Hagerson <j.hagerson@comcast.net> wrote:
> If I remember correctly, someone was creating PG texts using CSS and XHTML,
> but I don't remember who it was. I would like to see an example that uses
> these technologies. The W3.org website has all of the information, but
> sometimes it's like trying to find a needle in a haystack to find the answer
> to a specific question.
>
> If someone could provide the name of the poster or an e-book number, that
> would be very helpful. Thank you.

Many of the books processed by the DP site in the last few years have
had an XHTML version created. We even have very rough guidelines for
the marking up of things like poetry and page numbers, although
there's a lot of variation between individual projects.

'Uberprojects' like periodicals often have a style-guide which is
followed by almost all the posted issues. You could take a look at
individual issues to see which styles you like (or dislike). Here's a
random Punch issue:
    http://www.gutenberg.org/etext/17397
And a random Scientific American issue:
    http://www.gutenberg.org/etext/11649

Everyone will have their favourite example of HTML/XHTML texts on PG.
Personally I've been very impressed with some of the work that people
have done on books I've scanned (which for some reason means that my
name goes on the PG 'Produced by' line before them, which isn't a
particularly fair reflection on the amount of work put in). Take a
look for example at

Tintinnalogia, or, the Art of Ringing, by Fabian Stedman
    http://www.gutenberg.org/etext/18567

Amusements in Mathematics, by Henry Dudeney
    http://www.gutenberg.org/etext/16713

The Tatler, Volume 1, by Richard Steele et al., ed. George Aitken
    http://www.gutenberg.org/etext/13645

If you give more information about what particularly you're looking
for, I might be able to be a bit more selective rather than throwing
out random links to books I like!

-- 
Jon Ingram
From sly at victoria.tc.ca  Sun Oct  1 09:23:31 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sun Oct  1 09:23:36 2006
Subject: [gutvol-d] PG Examples of XHTML and CSS?
In-Reply-To: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>
References: <000001c6e507$f9bb6ee0$1f12fea9@sarek>
	<4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>
Message-ID: <Pine.GSO.4.58.0610010916450.1455@vtn1.victoria.tc.ca>


Also, you might want to check out the experiences of DP
volunteers in preparing HTML. I believe the general
consensus has been that there is enough variation of
needs between different projects that trying to define
one strict standard does not work.

But some general guidelines have emerged. Start at the page:
http://www.pgdp.net/wiki/HTML

That includes a link to a "CSS cookbook" that you might
find to be of interest.

Andrew

From jon at noring.name  Sun Oct  1 11:59:10 2006
From: jon at noring.name (Jon Noring)
Date: Sun Oct  1 11:59:22 2006
Subject: [gutvol-d] Alternate CSS style sheets for "My Antonia" requested
	(was: PG Examples of XHTML and CSS?)
In-Reply-To: <4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>
References: <000001c6e507$f9bb6ee0$1f12fea9@sarek>
	<4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>
Message-ID: <1108209902.20061001125910@noring.name>

John Hagerson wrote:

> If I remember correctly, someone was creating PG texts using CSS and XHTML,
> but I don't remember who it was. I would like to see an example that uses
> these technologies. The W3.org website has all of the information, but
> sometimes it's like trying to find a needle in a haystack to find the answer
> to a specific question.
>
> If someone could provide the name of the poster or an e-book number, that
> would be very helpful. Thank you.

I've placed online the book "My Antonia" by Willa Cather. It is valid
to XHTML 1.1 with three different CSS style sheet options (and a
version with no style sheet applied -- only browser defaults used):

   http://www.openreader.org/myantonia

Jose Menendez has an HTML 4.01 version of the same book (and my
version essentially relied on his for final proofing to catch the last
remaining transcription errors) with an internal CSS style sheet:

   http://www.ibiblio.org/ebooks/Cather/

(I'm not sure, but Jose may have donated his text version to PG. Our
versions differ in that mine is faithful to the original 1918 edition
including some text errors found in the original printing -- Jose's
is a corrected "reader" edition. But my edition does include markup
to flag the text errors and provide what the correct text should be
per Jose's corrections, plus a few listed at the UNL Cather site.)

*****

Now, my text layout skills are downright pitiful, and anyone wishing
to submit alternative CSS style sheets for my version of "My Antonia"
is welcome to do so -- the more the merrier (every person submitting
CSS will be acknowledged.) I believe the markup has sufficient
structural and semantic granularity to do some pretty advanced CSS
presentation.

Jon Noring


From j.hagerson at comcast.net  Sun Oct  1 13:01:23 2006
From: j.hagerson at comcast.net (John Hagerson)
Date: Sun Oct  1 13:01:34 2006
Subject: [gutvol-d] PG Examples of XHTML and CSS?
In-Reply-To: <Pine.GSO.4.58.0610010916450.1455@vtn1.victoria.tc.ca>
Message-ID: <000001c6e594$5f012970$1f12fea9@sarek>

Thank you very much Jon and Andrew. Between the samples listed, the
cookbook, and the other resources noted on the PG wiki, I think I will be
able to mark up the text I'm working on. I need to think more of semantic
tags rather than presentation tags. There is a gestalt to this that I
haven't quite mastered.


From jon at noring.name  Sun Oct  1 13:49:59 2006
From: jon at noring.name (Jon Noring)
Date: Sun Oct  1 13:50:12 2006
Subject: [gutvol-d] PG Examples of XHTML and CSS?
In-Reply-To: <000001c6e594$5f012970$1f12fea9@sarek>
References: <Pine.GSO.4.58.0610010916450.1455@vtn1.victoria.tc.ca>
	<000001c6e594$5f012970$1f12fea9@sarek>
Message-ID: <1511133066.20061001144959@noring.name>

John Hagerson wrote:

> Thank you very much Jon and Andrew. Between the samples listed, the
> cookbook, and the other resources noted on the PG wiki, I think I will be
> able to mark up the text I'm working on. I need to think more of semantic
> tags rather than presentation tags. There is a gestalt to this that I
> haven't quite mastered.

Glad to have been of help.

My call for alternate style sheets for my version of "My Antonia" is
possible only because the markup is strictly structural/semantic. Had
I done old-fashioned HTML markup (mixing presentational tags in with
the structural/semantic tags), that flexibility of presentation would
no longer be possible. (It's also important NOT to use tables for
layout purposes.)

An interesting site which demonstrates the full power of CSS and the
separation of presentation from structure is the CSS Zen Garden site:

   http://www.csszengarden.com/

where the same XHTML 1.0 Strict document (well essentially the same
with respect to structural/semantic markup) is presented in hundreds of
different ways solely by swapping the CSS style sheet (background
images are also customized and applied using CSS). It's amazing
what can be done with CSS applied to purely structural/semantic markup.
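
A minimal sketch of that separation, in Python, assuming a made-up
chapter fragment and made-up style sheet names ("classic.css",
"modern.css"); none of these come from CSS Zen Garden or from "My
Antonia", and the DOCTYPE is left out for brevity. The body markup is
built once and never changes; only the style sheet link differs
between renderings:

```python
from xml.sax.saxutils import escape, quoteattr

# Purely structural body: no fonts, colours, or layout tables.
# (Hypothetical sample content, not taken from any real e-book.)
BODY = (
    '<div class="chapter">\n'
    '  <h2>Chapter I</h2>\n'
    '  <p>First paragraph of the chapter.</p>\n'
    '</div>'
)


def render(title, stylesheet=None):
    """Wrap the same BODY in a page; only the style sheet link varies."""
    link = ""
    if stylesheet is not None:
        link = ('  <link rel="stylesheet" type="text/css" href=%s />\n'
                % quoteattr(stylesheet))
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        '<head>\n'
        '  <title>%s</title>\n'
        '%s'
        '</head>\n'
        '<body>\n%s\n</body>\n'
        '</html>\n'
    ) % (escape(title), link, BODY)


plain = render("Sample")                  # browser defaults only
classic = render("Sample", "classic.css")
modern = render("Sample", "modern.css")
```

The three pages differ in one line of the head at most, which is why
anyone can contribute a new presentation without touching the text.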

Another important aspect of having structural/semantic-only markup is
accessibility. Such documents have a high degree of accessibility
(again, it is important NOT to use table markup for layout purposes if
one wants maximal accessibility -- CSS Zen Garden shows that tables
are not necessary for complex layouts.)

A while back I did some XHTML markup on the "We Media" document for
JD Lasica and the OurMedia project. I asked the CSS authoring
community for alternate CSS style sheets for that document. Two people
supplied CSS:

   http://www.openreader.org/wemedia/

(I like Bob's a little better. Note how readable the document is even
without CSS, which is accomplished by proper XHTML markup.)


Jon Noring

From Bowerbird at aol.com  Sun Oct  1 14:51:46 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Oct  1 14:51:53 2006
Subject: [gutvol-d] a 6-year-old
Message-ID: <591.2094c8ac.32519272@aol.com>


distributed proofreaders is 6 years old today.   happy birthday!

-bowerbird
From schultzk at uni-trier.de  Sun Oct  1 23:54:56 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Sun Oct  1 23:55:02 2006
Subject: [gutvol-d] oh geez
In-Reply-To: <517.7d6a5e4.324ebb23@aol.com>
References: <517.7d6a5e4.324ebb23@aol.com>
Message-ID: <6EF33CA4-2593-4C3C-912E-C83AC1CBD081@uni-trier.de>

Hi Bowerbird,

	Thanx for the refresher course, but that was not my point.
	I AGREE with you fully mark-up is a pain in the old behind.

		Keith.

Am 29.09.2006 um 20:08 schrieb Bowerbird@aol.com:

> keith said:
> >   I doubt that very much!!
> >   Mark-up is a necessity of
> >   language and communication
> >   whether you see it or not.
>
> zen markup language
> _is_ a form of "markup",
> but it's the "light" kind --
> not that _heavy_ stuff --
> so it doesn't take much
> time or money or energy
> to "apply" it where needed.
>
> as to "whether you see it or not",
> z.m.l. generally tries to be invisible.

> [snip, snip rest deleted for brevity]

From Bowerbird at aol.com  Mon Oct  2 00:26:50 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct  2 00:26:59 2006
Subject: [gutvol-d] oh geez
Message-ID: <51f.7c17ac2.3252193a@aol.com>

keith said:
>    I AGREE with you fully mark-up is a pain in the old behind.

that's why -- when they realize markup is _also_ unnecessary --
people will leave it behind, immediately, like a bad housemate,
and be relieved to be done with it, and swear never to go back...        :+)

-bowerbird
From hyphen at hyphenologist.co.uk  Mon Oct  2 02:21:40 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Mon Oct  2 02:21:53 2006
Subject: [gutvol-d] oh geez
In-Reply-To: <51f.7c17ac2.3252193a@aol.com>
References: <51f.7c17ac2.3252193a@aol.com>
Message-ID: <uam1i2doa2ukt23nnlnm0tn29m512b0e0t@4ax.com>

On Mon, 2 Oct 2006 03:26:50 EDT,  Bowerbird@aol.com wrote:

|keith said:
|>    I AGREE with you fully mark-up is a pain in the old behind.
|
|that's why -- when they realize markup is _also_ unnecessary --
|people will leave it behind, immediately, like a bad housemate,
|and be relieved to be done with it, and swear never to go back...        :+)


I left Mark Up behind way back in 1985.
For HTML I rely on NVU, which is WYSIWYG.
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From nwolcott2ster at gmail.com  Mon Oct  2 08:01:35 2006
From: nwolcott2ster at gmail.com (Norm Wolcott)
Date: Mon Oct  2 08:09:06 2006
Subject: [gutvol-d] PG Examples of XHTML and CSS?
References: <000001c6e507$f9bb6ee0$1f12fea9@sarek><4baf53720610010340m563643a3vaba8b58b1f3cce94@mail.gmail.com>
	<Pine.GSO.4.58.0610010916450.1455@vtn1.victoria.tc.ca>
Message-ID: <003c01c6e633$c07d5e40$640fa8c0@atlanticbb.net>

If you care to go through the pain of installing Guiguts

http://www.pgdp.net/wiki/PPTools/Guiguts

There is an option to create a Css-XHTML web page from a text version made
to PG standards.

The HTML/CSS is quite generic and can be tweaked for individual needs. Since
DP may take a couple of years to process a book now, doing your own thing
may again be an option.
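
To give a rough idea of what "text in, XHTML out" means, here is a
minimal Python sketch. It is NOT how Guiguts works internally (I am
not describing its algorithm); it only assumes that paragraphs in the
plain text are separated by blank lines, and the function name is
made up for illustration:

```python
from xml.sax.saxutils import escape


def paragraphs_to_xhtml(text):
    """Turn blank-line-separated paragraphs into <p> elements.

    Hypothetical helper: line breaks inside a paragraph are collapsed
    to single spaces, and &, <, > are escaped.
    """
    blocks = [b for b in text.split("\n\n") if b.strip()]
    return "\n".join(
        "<p>%s</p>" % escape(" ".join(b.split())) for b in blocks
    )


sample = "One & two\nwrapped.\n\nSecond."
html = paragraphs_to_xhtml(sample)
```

Anything beyond plain paragraphs (poetry, page numbers, footnotes) is
where the per-project tweaking comes in.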


nwolcott2@post.harvard.edu

From nwolcott2ster at gmail.com  Mon Oct  2 08:13:41 2006
From: nwolcott2ster at gmail.com (Norm Wolcott)
Date: Mon Oct  2 08:27:56 2006
Subject: [gutvol-d] Scraping text from Univ Mich googles
Message-ID: <008701c6e637$463b02a0$640fa8c0@atlanticbb.net>

Is there any way, other than a page by page scraping of the html from the text images provided by UMich for their google books, to get the whole text in one file, or thereabouts?

The other question is: does re-OCR'ing the page images give any better results than starting with the page texts given by google? Have any of the other google participants seen fit to provide the google text?

nwolcott2@post.harvard.edu
From hart at pglaf.org  Mon Oct  2 09:47:06 2006
From: hart at pglaf.org (Michael Hart)
Date: Mon Oct  2 09:47:07 2006
Subject: [gutvol-d] oh geez, part 2
In-Reply-To: <c3c.4b1c2b5.32502074@aol.com>
References: <c3c.4b1c2b5.32502074@aol.com>
Message-ID: <Pine.LNX.4.60.0610020945350.11735@pglaf.org>


Once you find yourself sucked down into the mud,
you'll find that they enjoy it, are practiced at it,
and that the only way to beat them is to become one of them,
which is total defeat.


On Sat, 30 Sep 2006 Bowerbird@aol.com wrote:

> jon said:
>>   Am I the only one to see who
>>    is really slinging the mud here?
>
> i did not resort to anything ad hominem.
>
> i said unflattering things, yep i sure did!,
> but if any of them do not jibe with reality,
> then by all means feel free to express that.
> if i agree that i overstepped appropriateness,
> i will be more than happy to issue an apology.
>
> while i talk about the issues,
> david attacks me _personally_,
> (and strays from the truth too),
> instead of addressing my points.
> that's my definition of mudslinging.
>
> anyway, i rarely mention david at all,
> and probably shouldn't have gone on
> after that initial post, but somebody
> _did_ ask.   (and i think that i was fair
> by advising him to read david's blog
> and make up his own mind about it.)
>
> -bowerbird
>
From Bowerbird at aol.com  Mon Oct  2 09:55:02 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct  2 09:55:08 2006
Subject: [gutvol-d] re: Scraping text from Univ Mich googles
Message-ID: <c0c.5c6f900.32529e66@aol.com>

norm said:
>    Is there any way, other than a page by page scraping of the html 
>    from the text images provided by UMich for their google books--
>    to get the whole text in one file, or thereabouts.

i've written a scraper-program, yeah.

i'm reluctant to release it to the public, because in the hands of
the wrong person, it could really piss off umichigan, to the point
of making them reconsider their decision to release the o.c.r. text.

but i'd be happy to send it to _you_, norm,
and anyone who has proven by their actions
that they're dedicated to the cause of e-books,
and willing to do the work of digitizing books...

for those who are interested in scraping text from umichigan,
you might wanna read a series of posts i've been making to
the bookpeople listserve, on digitizing umichigan o.c.r. text:
>    http://onlinebooks.library.upenn.edu/webbin/bparchive
search for "feedback to umichigan" to find the series...


>    The other question is does re-OCR'ing the page images 
>    give any better results than starting with the page texts 
>    given by google? 

it just might.   i haven't done any kind of inventory yet,
but the o.c.r. text for the one book i'm doing for that
series of posts is badly flawed.   it's missing much info,
including paragraphing, text-styling, text-indentation,
and even the hyphens on the end-of-line hyphenates,
so it's been an unnecessarily hard job to babysit the text.

nonetheless, i'm still on-track to digitize the entire book in 
just one hour, and i've documented each task meticulously,
so you can make up your own mind on how you'd proceed.


>    Have any of the other google participants 
>    seen fit to provide the google text?

not yet, as far as i know, but i hope they all will eventually.

as you'll see from the umichigan text, however, we cannot
count on the interface to be even minimally desirable, so
it's gonna be necessary to scrape that content so that we
can provide it to people in a form with acceptable usability.


>    Since DP may take a couple of years to process a book now, 
>    doing your own thing may again be an option.

i think the choice of one hour of work versus months of waiting
for a book to come out of d.p. is a choice with stark perspective.

-bowerbird
From hart at pglaf.org  Mon Oct  2 09:59:34 2006
From: hart at pglaf.org (Michael Hart)
Date: Mon Oct  2 09:59:36 2006
Subject: [gutvol-d] oh geez, part 2
In-Reply-To: <c42.41f810b.324ed2f6@aol.com>
References: <c42.41f810b.324ed2f6@aol.com>
Message-ID: <Pine.LNX.4.60.0610020957380.11735@pglaf.org>


Well, since I have some friends who already have one,
and assure me that it is so much of a dog that you
should be prepared with flea remedies, I would have
to go along with David Rothman in this instance,
though I usually take the same precautions with him.

Michael


On Fri, 29 Sep 2006 Bowerbird@aol.com wrote:

> david rothman is advising people
> not to get caught up in the hype
> over the "forthcoming" sony reader.
>
> your lesson in irony is over for today.
>
> -bowerbird
>
From gbnewby at pglaf.org  Mon Oct  2 15:19:16 2006
From: gbnewby at pglaf.org (Greg Newby)
Date: Mon Oct  2 15:19:18 2006
Subject: [gutvol-d] Make a video, get a thumper
Message-ID: <20061002221916.GA20599@pglaf.org>

The new Sun x4500 server (previously known as the "thumper")
is a 24TB file server (twenty-four terabytes).  They have
a new offer out where those who make a video can win one.

This would be ideal for PG to do a massive public collection
of collections, metadata, etc.  I can think of several 
ideas along that theme....  but there's close to zero chance
I can make a video anytime soon.

If you might be interested, take a look:
  http://sunflash.sun.com/articles/103/4/promos/17052

  -- Greg

From traverso at dm.unipi.it  Tue Oct  3 02:33:56 2006
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Tue Oct  3 02:54:54 2006
Subject: [gutvol-d] Make a video, get a thumper
In-Reply-To: <20061002221916.GA20599@pglaf.org> (message from Greg Newby on
	Mon, 2 Oct 2006 15:19:16 -0700)
References: <20061002221916.GA20599@pglaf.org>
Message-ID: <200610030933.k939XudT028369@posso9.dm.unipi.it>

>>>>> "Greg" == Greg Newby <gbnewby@pglaf.org> writes:

    Greg> The new Sun x4500 server (previously known as the "thumper")
    Greg> is a 24TB file server (twenty-four terabytes).  They have a
    Greg> new offer out where those who make a video can win one.

    Greg> This would be ideal for PG to do a massive public collection
    Greg> of collections, metadata, etc.  I can think of several ideas
    Greg> along that theme....  but there's close to zero chance I can
    Greg> make a video anytime soon.

    Greg> If you might be interested, take a look:
    Greg> http://sunflash.sun.com/articles/103/4/promos/17052

Interesting, especially the 10-rack with 240TB for $470,995.00

The entry level, 12TB for $32,995.00 might be affordable for PG,
with a fund-raising campaign supported by a clear use project. Sun
itself might contribute with a substantial discount.

Carlo


From Bowerbird at aol.com  Fri Oct  6 14:58:31 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct  6 14:58:44 2006
Subject: [gutvol-d] re: Scraping text from Univ Mich googles
Message-ID: <489.57ec8658.32582b87@aol.com>

just wanted to give y'all an update on this...

it ends up the o.c.r. text from the university of michigan is
quite worthless, so bad there's no use in even scraping it...

almost all of it is lacking quote-marks and em-dashes and
the hyphens from end-of-line hyphenates, and paragraphs
and text-styling and text-indentation too, so it's more work
in most books to restore all that than to do the o.c.r. yourself.

in fact, it'd probably be better to type a book from scratch
than try to deal with this ugly o.c.r., because at least with
a type-in, you can actually follow the narrative of the book.

so i guess you'd have to say the umichigan o.c.r. is actually
_worse_ than worthless.   me, i'd be _embarrassed_ to post it
in a public place, let alone offer it to a university community.
but hey, maybe that's just _me_, know what i mean?

so -- at least at this point in time -- michael hart was right
that the google project wouldn't give us good digital text...

of course, i was _also_ right, when i said that we should be
willing to create good digital text ourselves, from the scans.
and that still holds true...

but -- at least so far -- i was wrong when i predicted that
we would be given highly-proofed text from the project...

so there you have it, michael.   i was wrong.   you were right.

-bowerbird
From Bowerbird at aol.com  Sat Oct  7 11:43:37 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 11:43:48 2006
Subject: [gutvol-d] worse than worthless
Message-ID: <ca8.286e0b.32594f59@aol.com>

ok, when i said the umichigan o.c.r. is "worse than worthless",
that was a rather unflattering description, wasn't it?   yes it was.

but judge for yourself whether it's fair, with these 2 verne works:

>    http://snowy.arsc.alaska.edu/bowerbird/misc/eighty.txt
>    http://snowy.arsc.alaska.edu/bowerbird/misc/china.txt

the lost hyphenation and paragraphing can be restored automatically,
in most cases, so doesn't have to entail _that_ much work (but some)...
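
a minimal python sketch of the hyphenate repair, assuming a word list
is available (the tiny set in the usage note is a stand-in for a real
dictionary, and the function name is made up). when the o.c.r. has
dropped the hyphen, a word broken across a line-end shows up as two
fragments; the sketch joins them only when the joined form is a known
word and the two fragments are not both words on their own:

```python
def rejoin_split_words(lines, known_words):
    """Rejoin words split across line ends whose hyphen was dropped.

    Hypothetical helper. known_words is a set of lowercase words.
    The fragment at the start of a line is moved up onto the end of
    the previous line. Whitespace inside lines is normalised.
    """
    out = []
    for line in lines:
        tokens = line.split()
        if out and tokens:
            prev = out[-1].split()
            if prev:
                head, tail = prev[-1], tokens[0]
                core = tail.rstrip(".,;:!?")   # ignore trailing punctuation
                joined = (head + core).lower()
                both_words = (head.lower() in known_words
                              and core.lower() in known_words)
                if joined in known_words and not both_words:
                    prev[-1] = head + tail
                    out[-1] = " ".join(prev)
                    tokens = tokens[1:]
        out.append(" ".join(tokens))
    return out


words = {"member", "of", "the", "in", "to", "into"}
fixed = rejoin_split_words(
    ["Phileas Fogg was a mem", "ber of the Reform Club,"], words)
```

a pair like "in" / "to" is left alone because both halves are words,
which is the kind of case that still needs a human eye.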

the lost quote-marks, however, are a _ton_ of work to reinstate.

likewise the em-dashes (although there usually aren't too many)
and text-styling and formatting (which vary from book-to-book).

i'd think it's nearly impossible to write routines to automate all that.
(i am not even slightly inclined to take it on as a difficult challenge.)

moreover, since if you do the o.c.r. _correctly_, you can avoid all this
unnecessary work, and since batch o.c.r. only takes a few minutes to
set up, there's _no_ reason to waste your time with umichigan o.c.r.

your o.c.r. program will pay for itself in no time, and you will be
_considerably_ less frustrated, which is worth a lot all by itself...

-bowerbird
From Bowerbird at aol.com  Sat Oct  7 12:18:37 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 12:18:43 2006
Subject: [gutvol-d] re: worse than worthless
Message-ID: <54b.849850a.3259578d@aol.com>

i forgot to mention that, in addition to all the missing data,
the _recognition_ itself on the "80 days" book is atrocious...

if you want a good laugh about how bad o.c.r. can get,
that's one example that will give it to you...

-bowerbird
From Bowerbird at aol.com  Sat Oct  7 12:50:10 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 12:50:17 2006
Subject: [gutvol-d] more amusement
Message-ID: <c19.6678b29.32595ef2@aol.com>

so i went to google to get the scans for the "chinaman" book
by jules verne, and was amused to discover their pagenumbers
are off by 2, which means that this page right here...

>    http://books.google.com/books?vid=LCCN01009859&id=-82QXfOrkwAC&pg=PP11&dq=%22The+tribulations+of+a+Chinaman+in+China%22&as_brr=1

has all the links that were meant for this contents page...

>    http://books.google.com/books?vid=LCCN01009859&id=-82QXfOrkwAC&pg=PP9&dq=%22The+tribulations+of+a+Chinaman+in+China%22&as_brr=1

and that, when you search for terms, the yellow highlighting
is on the wrong page.   this is the kind of comical b.s. you get
when your filenames don't include pagenumbers at their core,
and you end up tripping all over your "metadata pointers"...
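
a minimal python sketch of what "pagenumbers at the core" of a filename
could look like. the scheme here is purely an assumption for
illustration ("f" prefix for front-matter scans, "p" for body pages,
zero-padded printed page number); it is not google's scheme, and the
function names are made up:

```python
def scan_filename(page, front_matter=False, ext="png"):
    """Build a scan filename from the printed page number.

    Hypothetical naming scheme: 'f' marks front matter, 'p' the body.
    """
    prefix = "f" if front_matter else "p"
    return "%s%03d.%s" % (prefix, page, ext)


def page_of(filename):
    """Recover (is_front_matter, printed page number) from a filename."""
    stem = filename.rsplit(".", 1)[0]
    return stem[0] == "f", int(stem[1:])


contents_scan = scan_filename(9, front_matter=True)
body_scan = scan_filename(123)
```

because the page number is read straight out of the name, there is no
separate offset table to drift out of sync with the scans.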

-bowerbird
From Bowerbird at aol.com  Sat Oct  7 14:31:06 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 14:31:14 2006
Subject: [gutvol-d] how do i get to this url?
Message-ID: <54b.84ab814.3259769a@aol.com>

why, when i asked for this url:
>    http://www.gutenberg.org/files/17903/17903-h/17903-h.htm

am i directed to this page?
>    http://www.gutenberg.org/etext/17903

even when i ask for the above link from the overview page,
i am recycled back to the overview page.

i've noticed this same type of bug on other e-texts as well...

-bowerbird
From Bowerbird at aol.com  Sat Oct  7 14:44:49 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 14:45:03 2006
Subject: [gutvol-d] how do i get to this url?
Message-ID: <bd3.569ffcd.325979d1@aol.com>

i said:
>    why, when i asked for this url:

i forgot to say the bug doesn't happen in all of my browsers,
just my main one (camino, the newest version, under o.s.x.)...

-bowerbird
From nwolcott2ster at gmail.com  Sat Oct  7 16:23:26 2006
From: nwolcott2ster at gmail.com (Norm Wolcott)
Date: Sat Oct  7 16:24:37 2006
Subject: [gutvol-d] worse than worthless
References: <ca8.286e0b.32594f59@aol.com>
Message-ID: <007201c6ea67$9d2c6020$640fa8c0@atlanticbb.net>

I would say that the China text is considerably better than the 80 Days text. Quality apparently varies from book to book. The 80 Days book had a very narrow printed page, and so many hyphenated lines, whose hyphens were lost. China does not seem to have many hyphenated lines. In both cases it is necessary to have the book available for good scans for final editing. I also may have lost something in the conversion from utf to iso without doing any converting. 

Also there seems to be much less conversation in this book, making restoring quote marks less of a challenge. 

nwolcott2@post.harvard.edu
  ----- Original Message ----- 
  From: Bowerbird@aol.com 
  To: gutvol-d@lists.pglaf.org ; Bowerbird@aol.com 
  Sent: Saturday, October 07, 2006 2:43 PM
  Subject: [gutvol-d] worse than worthless


  ok, when i said the umichigan o.c.r. is "worse than worthless",
  that was a rather unflattering description, wasn't it?  yes it was.

  but judge for yourself whether it's fair, with these 2 verne works:

  >   http://snowy.arsc.alaska.edu/bowerbird/misc/eighty.txt
  >   http://snowy.arsc.alaska.edu/bowerbird/misc/china.txt

  the lost hyphenation and paragraphing can be restored automatically,
  in most cases, so doesn't have to entail _that_ much work (but some)...

  the lost quote-marks, however, are a _ton_ of work to reinstate.

  likewise the em-dashes (although there usually aren't too many)
  and text-styling and formatting (which vary from book-to-book).

  i'd think it's nearly impossible to write routines to automate all that.
  (i am not even slightly inclined to take it on as a difficult challenge.)

  moreover, since if you do the o.c.r. _correctly_, you can avoid all this
  unnecessary work, and since batch o.c.r. only takes a few minutes to
  set up, there's _no_ reason to waste your time with umichigan o.c.r.

  your o.c.r. program will pay for itself in no time, and you will be
  _considerably_ less frustrated, which is worth a lot all by itself...

  -bowerbird


From Bowerbird at aol.com  Sat Oct  7 18:27:53 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct  7 18:28:02 2006
Subject: [gutvol-d] worse than worthless
Message-ID: <389.cafd767.3259ae19@aol.com>

norm said:
>    China does not seem to have many hyphenated lines. In both cases
>    it is necessary to have the book available for good scans for final
>    editing.

having spent time working with the o.c.r. from umichigan, 
i can assure you it will be faster for you to re-do the o.c.r.

if you are bound and determined to use their text, though,
i can send you a program that will automatically repair most
of the end-line hyphenates, and restore much paragraphing.
for the rest, though, you're pretty much on your own, sadly...
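
A minimal sketch of the kind of end-of-line hyphen repair described
above. This is not bowerbird's program, which is not shown in this
thread; the name rejoin_hyphenated is hypothetical, and the sketch
assumes the trailing hyphen survived in the text. Where the o.c.r.
dropped the hyphen itself, a dictionary lookup would be needed instead.

```python
import re

# A line that ends in a letter followed by a hyphen is treated as a split word.
ENDS_WITH_SPLIT_WORD = re.compile(r"[A-Za-z]-$")


def rejoin_hyphenated(lines):
    """Join words broken by an end-of-line hyphen (hypothetical helper).

    The first word of the following line is pulled up onto the hyphenated
    line, but only when it starts with a lowercase letter, so that
    "Anglo-" / "Saxon" is left alone.
    """
    out = []
    for raw in lines:
        line = raw.rstrip()
        stripped = line.lstrip()
        if out and ENDS_WITH_SPLIT_WORD.search(out[-1]) and stripped[:1].islower():
            head, _, rest = stripped.partition(" ")
            out[-1] = out[-1][:-1] + head  # drop the hyphen, append the word's tail
            if not rest:
                continue  # the whole line was consumed by the join
            line = rest
        out.append(line)
    return out
```

One known limitation: a genuine compound that happens to break at the
line end ("well-" / "known") is joined as well, so a word-list check
would be needed before trusting the result.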


>    Also there seems to be much less conversation in this book, 
>    making restoring quote marks less of a challenge.

it might seem that way, but i predict you will find your error-rate
is quite high, unacceptably high, not something that you would
want to put your name on, not if you have any sense of pride...

even if this is a rare edition, you won't get much honor by issuing
a flawed digitization out into the world...

-bowerbird
From hyphen at hyphenologist.co.uk  Sat Oct  7 23:45:03 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sat Oct  7 23:45:17 2006
Subject: [gutvol-d] New Web site problem
Message-ID: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>

The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
the link to the Project Gutenberg Upload Pages  http://upload.pglaf.org.  I
searched long and hard but failed to find it :-(.    May be there somewhere
but I was forced back onto my copyright clearance email, to get there.

The site may now be Wiki, but if everyone put links where they wanted, the
whole site would rapidly become a mess.  Perhaps someone who understands
the layout of the new site could add it.
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From marcello at perathoner.de  Sun Oct  8 08:22:15 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Oct  8 08:22:20 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
Message-ID: <452917A7.60006@perathoner.de>

Dave Fawthrop wrote:

> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> the link to the Project Gutenberg Upload Pages  http://upload.pglaf.org.  I
> searched long and hard but failed to find it :-(.    May be there somewhere
> but I was forced back onto my copyright clearance email, to get there.
> 
> The site may now be Wiki, but if everyone put links where they wanted, the
> whole site would rapidly become a mess.  Perhaps someone who understands
> the layout of the new site could add it.

If you really mean wikipedia, maybe you should contact them :-)


There never was a link to upload.pglaf.org on the main page.

Try this page:

http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Catenacci at Ieee.Org  Sun Oct  8 11:18:41 2006
From: Catenacci at Ieee.Org (Onorio Catenacci)
Date: Sun Oct  8 11:18:44 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <452917A7.60006@perathoner.de>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
	<452917A7.60006@perathoner.de>
Message-ID: <c26320b80610081118r56b67daese46280a760c6fa40@mail.gmail.com>

On 10/8/06, Marcello Perathoner <marcello@perathoner.de> wrote:
> Dave Fawthrop wrote:
>
> > The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> > the link to the Project Gutenberg Upload Pages  http://upload.pglaf.org.  I
> > searched long and hard but failed to find it :-(.    May be there somewhere
> > but I was forced back onto my copyright clearance email, to get there.
> >
> > The site may now be Wiki, but if everyone put links where they wanted, the
> > whole site would rapidly become a mess.  Perhaps someone who understands
> > the layout of the new site could add it.
>
> If you really mean wikipedia, maybe you should contact them :-)
>
>
> There never was a link to upload.pglaf.org on the main page.
>
> Try this page:
>
> http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook
>
>
>

I was wondering why he brought up Wikipedia. :-)

-- 
Onorio
From hyphen at hyphenologist.co.uk  Sun Oct  8 12:50:26 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sun Oct  8 12:50:39 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <452917A7.60006@perathoner.de>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>
	<452917A7.60006@perathoner.de>
Message-ID: <p4lii2t57v3bme29lvjt0c0m2djjknkbik@4ax.com>

On Sun, 08 Oct 2006 17:22:15 +0200,  Marcello Perathoner
<marcello@perathoner.de> wrote:

|Dave Fawthrop wrote:
|
|> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
|> the link to the Project Gutenberg Upload Pages  http://upload.pglaf.org.  I
|> searched long and hard but failed to find it :-(.    May be there somewhere
|> but I was forced back onto my copyright clearance email, to get there.
|> 
|> The site may now be Wiki, but if everyone put links where they wanted, the
|> whole site would rapidly become a mess.  Perhaps someone who understands
|> the layout of the new site could add it.
|
|If you really mean wikipedia, maybe you should contact them :-)

Oops copied the wrong URL

|There never was a link to upload.pglaf.org on the main page.
|
|Try this page:
|
|http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook

Yes but how do you get there from http://www.gutenberg.org/wiki/Main_Page

Got it right this time ;-)

Followed everything from
http://www.gutenberg.org/wiki/Category:Volunteering
and there is nothing there.
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From marcello at perathoner.de  Sun Oct  8 13:48:07 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Oct  8 13:48:12 2006
Subject: [gutvol-d] New Web site problem
In-Reply-To: <p4lii2t57v3bme29lvjt0c0m2djjknkbik@4ax.com>
References: <5p6hi2tduudf3dirm8olab7hs3ljk7l8i4@4ax.com>	<452917A7.60006@perathoner.de>
	<p4lii2t57v3bme29lvjt0c0m2djjknkbik@4ax.com>
Message-ID: <45296407.1010502@perathoner.de>

Dave Fawthrop wrote:
> On Sun, 08 Oct 2006 17:22:15 +0200,  Marcello Perathoner
> <marcello@perathoner.de> wrote:
> 
> |Dave Fawthrop wrote:
> |
> |> The new web site http://en.wikipedia.org/wiki/Main_Page seems to have lost
> |> the link to the Project Gutenberg Upload Pages  http://upload.pglaf.org.  I
> |> searched long and hard but failed to find it :-(.    May be there somewhere
> |> but I was forced back onto my copyright clearance email, to get there.
> |> 
> |> The site may now be Wiki, but if everyone put links where they wanted, the
> |> whole site would rapidly become a mess.  Perhaps someone who understands
> |> the layout of the new site could add it.
> |
> |If you really mean wikipedia, maybe you should contact them :-)
> 
> Oops copied the wrong URL
> 
> |There never was a link to upload.pglaf.org on the main page.
> |
> |Try this page:
> |
> |http://www.gutenberg.org/wiki/Gutenberg:Public_Domain_eBook_Submission_How-To#Where_to_Submit_the_eBook
> 
> Yes but how do you get there from http://www.gutenberg.org/wiki/Main_Page
> 
> Got it right this time ;-)
> 
> Followed everything from
> http://www.gutenberg.org/wiki/Category:Volunteering
> and there is nothing there.


Either use your browser's search function to search for "submit" on the
main page,

or enter "submit" into the "search site" box on the main page and click
on "Search Site",

or go to the How-To Category:

  http://www.gutenberg.org/wiki/Category:How-To


-- 
Marcello Perathoner
webmaster@gutenberg.org

From schultzk at uni-trier.de  Mon Oct  9 00:31:44 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Mon Oct  9 00:31:50 2006
Subject: [gutvol-d] how do i get to this url?
In-Reply-To: <bd3.569ffcd.325979d1@aol.com>
References: <bd3.569ffcd.325979d1@aol.com>
Message-ID: <676E337B-66AA-4D6F-A821-578397E97585@uni-trier.de>

Hi,

	I just tried loading the page and it came up o.k.
	Kind of made the program sluggish till it completely loaded.

	Using: PowerBook G4 (1.5 GB, 1.3 GHz) Mac OS X 10.4.8 and
	Camino Version 2006091101 (1.0.3Int).

	regards
		Keith.

On 07.10.2006 at 23:44, Bowerbird@aol.com wrote:

> i said:
> >   why, when i asked for this url:
>
> i forgot to say the bug doesn't happen in all of my browsers,
> just my main one (camino, the newest version, under o.s.x.)...
>
> -bowerbird
From Bowerbird at aol.com  Tue Oct 10 09:57:34 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 10 09:57:54 2006
Subject: [gutvol-d] credit-lines and generosity
Message-ID: <537.8b7861a.325d2afe@aol.com>

oh geez, some small-minded critics are
giving josh grief on the "posted" listserve
because he put his name on the "credits"
line for the work he did on preparing and
uploading some audio files to the library.

i guess they think that work just happens
magically.   even if whitewashers normally
work without credit, what does credit hurt?
i strongly believe they deserve it, big-time.

not that the credit-lines are all that vital
-- i routinely strip them off the e-texts --
but my goodness, why be so _stingy_?

-bowerbird
From joshua at hutchinson.net  Tue Oct 10 11:31:55 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Tue Oct 10 11:54:24 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
Message-ID: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>

Howdy all!

As I start to post audiobooks to the archive, I've started getting 
constructive criticism sent my way (thanks to those that have, btw!).  
One item is brought up over and over.

Is there any way to have the download links say which chapter the link 
is pointed to (or some other human identifiable information)?

Here is an example of The Marvelous Land of Oz
(http://www.gutenberg.org/etext/19466):

Apple iTunes Audiobook		none	738 KB	main site mirror sites P2P
Apple iTunes Audiobook		none	1.98 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.20 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.13 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	1.63 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.65 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.20 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.63 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.64 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.68 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.49 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.11 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.22 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.74 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.41 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.53 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.33 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.27 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	4.69 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.77 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.95 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.36 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	1.80 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	3.18 MB	main site mirror sites P2P
Apple iTunes Audiobook		none	2.26 MB	main site mirror sites P2P

That is the preface through chapter 24, but nothing on that page 
indicates which is which.

Now, I realize that the bibrec page was not designed with audio books 
in mind, so this is in no way meant as an attack on anyone's efforts.  
Rather, this is a question on what can we/I do to make those pages more 
accessible to the end user.

The text file contains a listing of the chapters, and I could create 
an HTML catalog with actual links (which I may do moving forward), but 
it doesn't help the layout of the current bibrec page which is ... 
well, a bit daunting to look at.

Hmm, one idea just came to me, so shoot me down if this is stupid.  
What if the bibrec page did NOT show any of the audio files directly, 
but rather just the link to an HTML document.  Then, when they click 
that, each chapter would be clearly labelled and linked to.  Is this 
stupid?  Is this doable?

Josh
From marcello at perathoner.de  Tue Oct 10 13:58:30 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Oct 10 13:58:33 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
In-Reply-To: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>
References: <5107266.1160505115369.JavaMail.?@fh1037.dia.cp.net>
Message-ID: <452C0976.10909@perathoner.de>

mailbox@hutchinson.net wrote:

> Is there any way to have the download links say which chapter the link
> is pointed to (or some other human identifiable information)?

Not with the current software.

Before implementing file comments (for chapter headings etc.) on the
bibrec page we need some standard way to pass them on to the automatic
cataloger. That means: posting some sort of RDF file along with the files.


> Now, I realize that the bibrec page was not designed with audio books
> in mind, so this is in no way meant as an attack on anyone's
> efforts. Rather, this is a question on what can we/I do to make those
> pages more accessible to the end user.

If you look at this page

  http://www.gutenberg.org/etext/9551

you'll see that the catalog groks two special file types: "readme" and
"index" and sorts them to the top of the list. You can build a nicely
formatted "index" file and post it along with your sound files.
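
A rough sketch of how such an "index" file could be generated, assuming
a plain HTML list of chapter links is acceptable. The function name, the
file names, and the chapter labels are made up for illustration; how the
catalog recognizes a file as "index" is not covered here.

```python
from html import escape


def build_index(title, chapters):
    """Return an HTML page listing (label, filename) pairs as links.

    `chapters` is an ordered list such as [("Preface", "00.m4b"), ...];
    the file names are placeholders, not real posted files.
    """
    items = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(escape(href, quote=True), escape(label))
        for label, href in chapters
    )
    return (
        "<html>\n<head><title>{0}</title></head>\n<body>\n"
        "<h1>{0}</h1>\n<ul>\n{1}\n</ul>\n</body>\n</html>\n"
    ).format(escape(title), items)


if __name__ == "__main__":
    print(build_index("The Marvelous Land of Oz",
                      [("Preface", "00.m4b"), ("Chapter 1", "01.m4b")]))
```

Relative links are used so the page keeps working on mirror sites as
long as it sits in the same directory as the sound files.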


> The text file contains a listing of the chapters, and I could create
> an HTML catalog with actual links (which I may do moving forward),
> but it doesn't help the layout of the current bibrec page which is
> ... well, a bit daunting to look at.
> 
> Hmm, one idea just came to me, so shoot me down if this is stupid. 
> What if the bibrec page did NOT show any of the audio files directly,
> but rather just the link to an HTML document.  Then, when they click
> that, each chapter would be clearly labelled and linked to.  Is this
> stupid?  Is this doable?

We can treat sound files the same way as image files which also don't
show up. In this case, if you don't post an index file, the sound files
will be accessible only through the apache directory listings.


Another problem is that in the past we didn't post text and audio
versions under the same etext number. Thus an etext no. can be declared
AudioBook or not but not both.


Also, people have asked for ways to download all sound files in one
swoop. Maybe we should post a standard playlist format, so people can
use their xmms / winamp to listen to the files?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From joshua at hutchinson.net  Tue Oct 10 18:49:23 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Tue Oct 10 18:49:38 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
Message-ID: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>

>----Original Message----
>From: marcello@perathoner.de
>
>If you look at this page
>
>  http://www.gutenberg.org/etext/9551
>
>you'll see that the catalog groks two special file types: "readme" and
>"index" and sorts them to the top of the list. You can build a nicely
>formatted "index" file and post it along with your sound files.
>

Excellent.  That looks good.  

Related question: How did the encoding field get populated for that 
one?  Did a cataloger do that by hand?  Is there something I should/can 
do for new audio postings?

>
>Also, people have asked for ways to download all sound files in one
>swoop. Maybe we should post a standard playlist format, so people can
>use their xmms / winamp to listen to the files?
>

Ah, yes, that is a good idea.  Librivox uses the m3u playlist format 
for streaming... That'd probably be well appreciated.  I'll see what I 
can put together on the next posting.
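
A minimal sketch of writing such a playlist, assuming the extended M3U
layout (an #EXTM3U header, then one #EXTINF line per track, with -1 for
an unknown length). The base URL and file names are placeholders, not
the real paths of any posted audiobook.

```python
def build_m3u(base_url, tracks):
    """Return extended-M3U text for a list of (title, filename) pairs.

    `base_url` and the file names are hypothetical; -1 tells the player
    that the track length is not known in advance.
    """
    base = base_url.rstrip("/")
    lines = ["#EXTM3U"]
    for title, filename in tracks:
        lines.append("#EXTINF:-1,{}".format(title))
        lines.append("{}/{}".format(base, filename))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_m3u("http://example.org/audio/",
                    [("Preface", "00.m4b"), ("Chapter 1", "01.m4b")]))
```

Absolute URLs let the playlist be streamed from anywhere; a copy with
bare file names would suit people who download the whole directory.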

Josh 
From marcello at perathoner.de  Wed Oct 11 03:18:31 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 11 03:18:36 2006
Subject: [gutvol-d] Audiobooks - Bibliographic record file listings
In-Reply-To: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>
References: <16854452.1160531363250.JavaMail.?@fh1038.dia.cp.net>
Message-ID: <452CC4F7.8070600@perathoner.de>

mailbox@hutchinson.net wrote:

> Related question: How did the encoding field get populated for that 
> one?  Did a cataloger do that by hand?  Is there something I should/can 
> do for new audio postings?

By hand. You don't need that unless you post the same file type with
different bitrates.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Wed Oct 11 10:51:36 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 11 10:51:46 2006
Subject: [gutvol-d] talking turkey
Message-ID: <404.6b4518bf.325e8928@aol.com>

well, it's 6 weeks until thanksgiving.

last year, i told david rothman that
i'd buy him a tofu turkey if the people
doing the "hundred dollar laptop"
actually had a machine for sale to
us "ordinary folks" for $200 or less
by this thanksgiving, a rumor that
david reported around that time...

in this blog entry:
>    http://www.teleread.org/blog/?p=3911

david had said, about the $200 quote:
>    The price figure is just speculation, 
>    but it seems realistic to me.

of course, what "seems realistic" to _david_
often seems to be totally unrealistic to me.

in addition to the $200 laptop prediction,
david said that "eventually" we would have
a $50 computer.   i pointed out, in a comment,
that a computer that cost $100 to build would
end up costing about $400 at retail, and david
advised folks to "tune in a year from now, and
we'll see who's right".   that's when i told him
i'd buy him a tofu turkey if his prediction held.

i also told him that, if there was a real computer
available for $50 within the next _five_ years,
i would buy him one, which seems a safe bet,
since in another 5 years, _lunch_ will cost $50,
and i'd be happy to buy david lunch some time.

well, sure enough, over the past year,
the "hundred dollar laptop" has been
rechristened (a number of times) to
take the focus off the _price_ (which 
-- even in volumes of millions of units,
-- doesn't seem to be quite obtainable),
and rumors of retail sales to americans
have been re-floated, but this time with
a pricetag around $450 (with a tax-break
write-off for your "donation" to charity).

so much for whose prediction was right.

someday the one-laptop-per-child project
_will_ create a very cheap laptop, perhaps
even one that can sell (in mass) for $100,
and thus in units of 1 for as little as $200,
but that day won't come in the next 6 weeks.

so david, i guess you had better plan on buying
your own tofu turkey for thanksgiving this year.

so why am i telling gutvol-d all this?

because i've been advising rothman all along to
focus on the _reality_ instead of all the _hype_,
so people start realizing e-books are here now,
instead of around a corner that never gets turned,
with one of the most solid e-book realities being
the long and proud history of project gutenberg,
which is now approaching its 20,000th e-text...

so i'll be giving thanks this year for the volunteers
-- from distributed proofreaders and elsewhere --
who have made this great cyberlibrary possible...

-bowerbird
From hart at pglaf.org  Wed Oct 11 14:06:28 2006
From: hart at pglaf.org (Michael Hart)
Date: Wed Oct 11 14:06:30 2006
Subject: [gutvol-d] talking turkey $100 Laptops
In-Reply-To: <404.6b4518bf.325e8928@aol.com>
References: <404.6b4518bf.325e8928@aol.com>
Message-ID: <Pine.LNX.4.60.0610111405200.32345@pglaf.org>


From what I understand, Libya has ordered a $100 laptop for each
of its schoolchildren. . . .

I'll try to forward the reference.

mh
From hart at pglaf.org  Wed Oct 11 14:07:04 2006
From: hart at pglaf.org (Michael Hart)
Date: Wed Oct 11 14:07:06 2006
Subject: [gutvol-d] $100 Laptop... (fwd)
Message-ID: <Pine.LNX.4.60.0610111406510.32345@pglaf.org>


---------- Forwarded message ----------
Date: Wed, 11 Oct 2006 12:32:39 -0700 (PDT)
Subject: $100 Laptop...

Libya has just ordered $100 laptops for their 1.2 million school children.

http://www.agoravox.com/article.php3?id_article=5235&id_forum=3267&var_mode=recalcul#forum

This should put the production line for these bad boys on a very firm footing.  Great stuff!
From joey at joeysmith.com  Wed Oct 11 14:16:08 2006
From: joey at joeysmith.com (joey)
Date: Wed Oct 11 14:34:48 2006
Subject: [gutvol-d] talking turkey $100 Laptops
In-Reply-To: <Pine.LNX.4.60.0610111405200.32345@pglaf.org>
References: <404.6b4518bf.325e8928@aol.com>
	<Pine.LNX.4.60.0610111405200.32345@pglaf.org>
Message-ID: <20061011211608.GA29634@joeysmith.com>

On Wed, Oct 11, 2006 at 02:06:28PM -0700, Michael Hart wrote:
> 
> From what I understand, Libya has ordered a $100 laptop for each
> of its schoolchildren. . . .
> 
> I'll try to forward the reference.
> 
> mh

You're probably looking for http://www.nytimes.com/2006/10/11/world/africa/11laptop.html
From Bowerbird at aol.com  Wed Oct 11 15:06:16 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 11 15:06:24 2006
Subject: [gutvol-d] $100 Laptop... (fwd)
Message-ID: <c41.512d649.325ec4d8@aol.com>

michael said:
>    This should put the production line for these bad boys 
>    on a very firm footing.  Great stuff!

yes, it was this news that reminded me of the prediction
that rothman had made last year; that's why i checked it...

however, i think o.l.p.c. is waiting until they have orders for
5 million before they're proceeding.   (i think they already
had a few million ordered, so maybe this libyan order will
get them over that hump.)   and yes, i think it's _great_ that
nations are now putting in their orders.   i would think that --
given the _half-trillion_dollars_ that have evaporated in iraq
-- it would be cost-effective for the united states to donate
5-10 million machines to various countries around the world,
to get this project going, and polish our tarnished reputation.

but, as you know, i'm not running the country.   "the decider" is...

-bowerbird
From Bowerbird at aol.com  Fri Oct 13 14:40:31 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 13 14:40:40 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <417.ce757e8.326161cf@aol.com>

remember years back, on this very listserve,
when we had a long-running series of threads
on my "zen markup language", where detractors
said it couldn't possibly provide sufficient detail
to delineate all of the structures found in books,
and i replied by making a list of those structures
and formulating a test document containing them?

>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml

and remember how nobody ever came up with a
structure that i had missed, indicating my list was
sufficiently strong, inclusive, and exhaustive?

well, jon noring -- better late than never -- is now
running through the same exercise on his listserve.

why?   because he has this "idea" about creating an
authoring-tool that would spit out various types of
e-book formats, like .html and .lit and so on.

and then david rothman writes up a teleblog entry
about this "cool idea by jon noring" as if _nobody_
in the world had ever had it before, let alone already 
created such an authoring tool.

and then robert nagel (of idiotprogrammer.com)
comments that "wow, if only such a tool existed!"

and the mutual hype society completes another cycle.

i mean, i'm really _glad_ that jon has come to admit
the importance of an authoring-tool.   i've been telling
him that he needed to recognize that for _years_ now.

but hey, we're already in year 2006, soon to be 2007.
if you're just now catching up to the "cool idea" of an
_authoring-tool_, then you will need to speed up the
process, especially if you are an "expert" in e-books,
as jon is very fond of having himself described.

so maybe someone should tell jon that i already have
the authoring-tool thing covered, and he can move
immediately to the next stage of his development, ok?

-bowerbird
From jon at noring.name  Fri Oct 13 16:23:39 2006
From: jon at noring.name (Jon Noring)
Date: Fri Oct 13 16:23:48 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <417.ce757e8.326161cf@aol.com>
References: <417.ce757e8.326161cf@aol.com>
Message-ID: <1191732505.20061013172339@noring.name>

Bowerbird wrote:

>  well, jon noring -- better late than never -- is now
>  running through the same exercise on his listserve.
>  
>  why?   because he has this "idea" about creating an
>  authoring-tool that would spit out various types of
>  e-book formats, like .html and .lit and so on.

*laugh!!!*

I knew you were going to bring this up here.

The simple answer is that I've been talking about this since The eBook
Community started in 1996. The Yahoo archive dates from mid-1999, and
just do a search there, and you will find my comments. Try "authoring
tool" for one search term, but this won't catch everything I've talked
about on this topic.

So your claim that I've only recently "seen the light" is simply
historical revisionism. It's wonderful to have searchable archives.
And the full ebook-list/TeBC archive will hopefully soon be on Google.


>  and then david rothman writes up a teleblog entry
>  about this "cool idea by jon noring" as if _nobody_
>  in the world had ever had it before, let alone already 
>  created such an authoring tool.

Nope, no one has issued a fairly simple to use ebook authoring tool
which exports into all the common ebook formats today, and into any
ebook format envisioned for the future.

And this includes you, Bowerbird -- you are not there yet. Call or
email me when you are finished and ready to market it to small
publishers. Of course, they'll ask for high-quality PDF output in which
they control all formatting and the quality of typesetting is at least
as good as Word, plus LIT, Mobipocket, RTF, OEBPS, OpenReader, the
various flavors of Palm formats (can't keep them straight), etc.

Funny thing, though, publishers will not ask for plain text.


>  i mean, i'm really _glad_ that jon has come to admit
>  the importance of an authoring-tool.   i've been telling
>  him that he needed to recognize that for _years_ now.

Wow! I'd never have known. <laugh/>


>  so maybe someone should tell jon that i already have
>  the authoring-tool thing covered, and he can move
>  immediately to the next stage of his development, ok?

I look forward to your producing a publisher-ready version of your
tool which will export into all the common ebook formats of today and
the foreseeable future. The world needs one! Here's your chance for
glory (which I know you are not seeking, humble you.)

Btw, do you plan to open source your tool? And if not, why not?

Jon Noring

(p.s., Bowerbird, have you approached small ebook publishers with your
tool to see if they will embrace it?)

From lee at novomail.net  Fri Oct 13 14:48:49 2006
From: lee at novomail.net (Lee Passey)
Date: Fri Oct 13 16:27:23 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <417.ce757e8.326161cf@aol.com>
References: <417.ce757e8.326161cf@aol.com>
Message-ID: <453009C1.202@novomail.net>

Bowerbird@aol.com wrote:

[snip]

> so maybe someone should tell jon that i already have
> the authoring-tool thing covered, and he can move
> immediately to the next stage of his development, ok?
> 
> -bowerbird

"Said the pieman to Simple Simon, 'Show me first your penny.'"

-- 
Nothing of significance below this line.

From Bowerbird at aol.com  Fri Oct 13 17:12:21 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 13 17:12:28 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <57c.6c45766.32618565@aol.com>

if anyone wants a plain-text to (x)html authoring tool,
i suggest they check out "markdown" for starters...

>    http://daringfireball.net/projects/markdown/

you can play around with it using its "dingus":
>    http://daringfireball.net/projects/markdown/dingus

my own interest is in _disintermediating_ publishers,
not marketing my programs to them.

have a nice weekend...

-bowerbird
From jon at noring.name  Fri Oct 13 17:19:10 2006
From: jon at noring.name (Jon Noring)
Date: Fri Oct 13 17:25:51 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <57c.6c45766.32618565@aol.com>
References: <57c.6c45766.32618565@aol.com>
Message-ID: <7810100940.20061013181910@noring.name>

Bowerbird wrote:

> if anyone wants a plain-text to (x)html authoring tool,
> i suggest they check out "markdown" for starters...
>  
>>    http://daringfireball.net/projects/markdown/
>  
>  you can play around with it using its "dingus":
>>    http://daringfireball.net/projects/markdown/dingus

Excellent reference. I've been monitoring the 'markdown' mailing list
for some time now. Interesting discussions there...


>  my own interest is in _disintermediating_ publishers,
>  not marketing my programs to them.

Well, now that we got your position on the matter...


Jon

From Bowerbird at aol.com  Sat Oct 14 00:38:39 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct 14 00:38:56 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <ca0.235ca1.3261edff@aol.com>

i said:
>   my own interest is in _disintermediating_ publishers

which, by the way, is why i remain staunchly anti-d.r.m.

some "experts" want to get "buy-in" from the publishers,
so they are willing to do something as stupid as putting
_locks_ on books (so as to turn 'em into cash-registers).
what a supremely stupid attitude.

one beauty of electronic-books and cyberspace is that
we can _free_ ourselves from the shackles placed on us
by the greedy-rich-boy middlemen who now siphon off
a boatload of the cash between artists and the audience,
and desperately wish to maintain their "business model".

why in the world you'd want to get "buy-in" from these
thieves is totally beyond me.   and that's _my_ position...

-bowerbird
From Bowerbird at aol.com  Sat Oct 14 15:56:01 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct 14 15:56:11 2006
Subject: [gutvol-d] re: oh geez, part 3
Message-ID: <457.6871ecdd.3262c501@aol.com>

it occurs to me that there is a need
to state the relevance to gutvol-d...

we want to be able to create books
in the formats people want them...

as most of you already know well,
david moynihan over at blackmask
-- whom i _support_ in his lawsuit,
unlike many fair-weather friends --
managed to convert the plain-text
e-texts from project gutenberg
to a wide variety of e-book formats,
and he did it _automatically_ using
scripts that he developed himself...

in contrast, the xslt workflow that
many posit as the mechanism that
turns x.m.l. files into various formats
still hasn't been developed or proven.

in a nutshell, conversions are not hard,
not for me.   might be hard for others,
but they're not hard for me.   that is all...

-bowerbird
From Bowerbird at aol.com  Sun Oct 15 14:21:39 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Oct 15 14:21:50 2006
Subject: [gutvol-d] online editing of documents, collaboratively
Message-ID: <46c.2324ee45.32640063@aol.com>

of course, the future of online documents is already here,
what with the arrival of web-apps to do word-processing,
which allows long-distance collaboration between people.

for instance, that test-suite that i pointed to just recently?
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml

here it is in an incarnation using google's online tool:
>    http://docs.google.com/View?docid=dgczchnc_0d9c9b6

to experiment, i re-did all the styling, links, etc., from scratch,
which was quite unnecessarily painful for me, since i am now
accustomed to the automatic formatting done by my tools, but
i'm confident that google will eventually catch up to me...      :+)

but notice that you can upload a document in various formats,
and i would assume that styling, links, etc., would be retained...

conversely, also notice that once a document is up, google handles
conversions to other formats, i.e., .html, .pdf, .rtf, .doc, and ascii...
i'm guessing that amazon's version of this tool will include a routine
that also converts into mobipocket format, wouldn't you think?      :+)

anyway, this is how digitization efforts should be done in 2007,
not the clunky markup-based way that distributed proofreaders
is settling on for its workflow.

-bowerbird

p.s.   thanks to the spellcheck feature of google's online tool, i found
a typo in my file.   darn, it's totally amazing how those things creep in!

p.p.s.   if you're allergic to google, this tool is getting good reviews:
>    http://www.zohowriter.com
From hyphen at hyphenologist.co.uk  Sun Oct 15 23:58:17 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Mon Oct 16 00:05:08 2006
Subject: [gutvol-d] online editing of documents, collaboratively
In-Reply-To: <46c.2324ee45.32640063@aol.com>
References: <46c.2324ee45.32640063@aol.com>
Message-ID: <l2b6j21lb71u3dipbht0qebqm23coeofv1@4ax.com>

On Sun, 15 Oct 2006 17:21:39 EDT,  Bowerbird@aol.com wrote:

I had a quick look at:
|>    http://docs.google.com/View?docid=dgczchnc_0d9c9b6

and found
>> chapter 13 -- unlucky 13 
>> there is no 13th floor in most buildings.

Not true over most of the world.  
Just a *silly* USian cultural Oddity.

This suggests that Chapter 13 in PG Books be renamed somehow, which is
*bad*.

-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From traverso at dm.unipi.it  Mon Oct 16 00:31:22 2006
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Mon Oct 16 00:29:36 2006
Subject: [gutvol-d] online editing of documents, collaboratively
In-Reply-To: <l2b6j21lb71u3dipbht0qebqm23coeofv1@4ax.com> (message from Dave
	Fawthrop on Mon, 16 Oct 2006 07:58:17 +0100)
References: <46c.2324ee45.32640063@aol.com>
	<l2b6j21lb71u3dipbht0qebqm23coeofv1@4ax.com>
Message-ID: <200610160731.k9G7VMD14905@pico.dm.unipi.it>

>>>>> "Dave" == Dave Fawthrop <hyphen@hyphenologist.co.uk> writes:

    Dave> On Sun, 15 Oct 2006 17:21:39 EDT, Bowerbird@aol.com wrote:
    Dave> I had a quick look at:
    Dave> |> http://docs.google.com/View?docid=dgczchnc_0d9c9b6

    Dave> and found
    >>> chapter 13 -- unlucky 13 there is no 13th floor in most
    >>> buildings.

    Dave> Not true over most of the world.  Just a *silly* USian
    Dave> cultural Oddity.

I disagree. In most of the world there is no 13th floor in most
buildings: most buildings have less than 13 floors. 

(and if they have, there is a 13th floor, even if you name it
differently, or if you don't name it at all) 

Carlo

From joey at joeysmith.com  Mon Oct 16 02:25:03 2006
From: joey at joeysmith.com (joey)
Date: Mon Oct 16 02:31:27 2006
Subject: [gutvol-d] re: oh geez, part 3
In-Reply-To: <457.6871ecdd.3262c501@aol.com>
References: <457.6871ecdd.3262c501@aol.com>
Message-ID: <20061016092503.GB29634@joeysmith.com>

On Sat, Oct 14, 2006 at 06:56:01PM -0400, Bowerbird@aol.com wrote:
> in contrast, the xslt workflow that
> many posit as the mechanism that
> turns x.m.l. files into various formats
> still hasn't been developed or proven.

Simply not true -- not for publishing in general, and not for PG-related
projects specifically.

I use XSLT to publish to HTML, PDF, plain text, and OASIS Open Document
Format [among others] from docbook and similar formats on a daily basis.

Additionally, I have previously shown XSLT stylesheets on gutvol-p that
took some XML provided by Greg (a bunch of Dickens works) which output
HTML and plain text. I stopped short of the PDF at the time because I
was not the only person working on the project, and it seemed to me that
some of the others were further along than I.

Additionally, I take exception to your assertion earlier in this list:

> and remember how nobody ever came up with a
> structure that i had missed, indicating my list was
> sufficiently strong, inclusive, and exhaustive?

I chose not to indulge your mania further. That does not mean I never
came up with a structure that you had missed. In fact, I found it trivial
to come up with such. Your reasoning is a classic logical fallacy,
commonly known as the "argument from ignorance": the claim that a premise
is true only because it has not been proven false.

All of that aside, please stop trying to import arguments from other fora
into this one. If I wanted to know the latest on what Jon or David or
the Teleblog community had to say on matters, I would seek it from them -
not indirectly from a known detractor to their cause via the PG mailing
lists. You have your own blog, please use that to stump for yourself.
From prosfilaes at gmail.com  Mon Oct 16 04:13:07 2006
From: prosfilaes at gmail.com (David Starner)
Date: Mon Oct 16 04:13:11 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <417.ce757e8.326161cf@aol.com>
References: <417.ce757e8.326161cf@aol.com>
Message-ID: <6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com>

On 10/13/06, Bowerbird@aol.com <Bowerbird@aol.com> wrote:
>  and remember how nobody ever came up with a
>  structure that i had missed, indicating my list was
>  sufficiently strong, inclusive, and exhaustive?

I'm curious whether I had you in my kill-file by then (which I still
do, but gmail makes it too easy to drag out killed messages), or if
this is just a deeply skewed memory. I suspect the latter, since your
test document says "there aren't a whole lot of tables in the e-texts
-- we're talking literature, not spreadsheets -- but your system
should handle tables anyway; not really big and hairy ones, just
simple ones", and http://www.pgdp.net/phpBB2/viewtopic.php?t=4311
shows some really big and hairy tables found in real PG etexts. The
test document certainly shows no evidence of the arbitrary evilness
the Early English Text Society and friends saw fit to hand us; heck,
it doesn't even show how to handle sidenotes, those things ubiquitous
in pre-18th century printing. Not to mention math. Why am I even
bothering to try and prove that statement comes out of your own little
world?
From bill at williamtozier.com  Mon Oct 16 05:14:07 2006
From: bill at williamtozier.com (William Tozier)
Date: Mon Oct 16 05:21:00 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com>
References: <417.ce757e8.326161cf@aol.com>
	<6d99d1fd0610160413r269a9ad2p7443fa53bd7d7981@mail.gmail.com>
Message-ID: <45930876-D41E-400A-AEB1-A87F57FF613E@williamtozier.com>


On Oct 16, 2006, at 7:13 AM, David Starner wrote:

> Why am I even
> bothering to try and prove that statement comes out of your own little
> world?

We all lapse, now and then. You beat me to it this time by a matter  
of minutes. Your backslide keeps me from jumping in as well. :)

Altruism.
-----
Bill Tozier
AIM:    vaguery@mac.com
blog:   http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype:  vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
   --Hjalmar Hjorth Boyesen


From Bowerbird at aol.com  Mon Oct 16 09:21:46 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 09:22:05 2006
Subject: [gutvol-d] re: oh geez, part 3
Message-ID: <c59.46780a6.32650b9a@aol.com>

joey said:
>    I have previously shown XSLT stylesheets on gutvol-p that 
>    took some XML provided by Greg (a bunch of Dickens works) 
>    which output HTML and plain text.

let's see that work joey.


>   I chose not to indulge your mania further. That does not mean 
>    I never came up with a structure that you had missed.

let's hear them, joey.


>   please stop trying to import arguments from other fora

i've stated the relevance each time.

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 10:04:15 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 10:05:08 2006
Subject: [gutvol-d] online editing of documents, collaboratively
Message-ID: <595.444e5f55.3265158f@aol.com>

dave said:
>    Just a *silly* USian cultural Oddity.

yeah right.

i was thinking we'd made it through
the recent friday-the-13th when --
rumble! -- earthquake rocks hawaii.

unrelated, you say?   sure it's "unrelated".


>    This suggests that Chapter 13 in PG Books 
>    be renamed somehow, which is *bad*.

ok, i withdraw that "suggestion"...          ;+)

***

carlo said:
>    In most of the world there is no 13th floor in most buildings: 
>    most buildings have less than 13 floors.

bingo!


>    (and if they have, there is a 13th floor, even if you 
>    name it differently, or if you don't name it at all)

semantics!

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 10:12:34 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 10:12:38 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <c2c.8df89e.32651782@aol.com>

david said:
>    your test document says "there aren't a whole lot of tables in the e-texts
>    -- we're talking literature, not spreadsheets -- but your system should 
>    handle tables anyway; not really big and hairy ones, just simple ones", 
>    and http://www.pgdp.net/phpBB2/viewtopic.php?t=4311
>    shows some really big and hairy tables found in real PG etexts. 

surely you don't think that pulling one e-text out of the whole library
-- or even 100 of them! -- negates that general statement, do you?

even _600_ exceptions would be just 3% of the ~20,000 e-texts now.

but let's get down to brass tacks, shall we?

in saying that "your system should be able to handle simple tables",
i'm just laying out a minimum requirement that would suffice here,
for the other people that might want to develop their system for p.g.

my own system will eventually be able to handle quite complex tables,
when i find the need to develop it that far.   and if you'd like some proof,
then hand me a list of 100 e-texts that use tables, and i will tackle them
first when the time for "attacking tables" comes up big on my agenda...
(and leave out the spalding baseball guides, i already know about them.)

in the meantime, if you think "tables" is something that you can point to
as a "shortcoming" in my list, then you really need to rethink.   i am quite
well aware of tables, and even included them in my test-suite, thank you.


>    The test document certainly shows no evidence of the arbitrary evilness
>    the Early English Text Society and friends saw fit to hand us; 

arbitrariness?   if i want to point to _arbitrariness_, i will point to
the inconsistencies in the production of the e-texts themselves,
which are riddled with inconsistencies.   i don't need to point to
work from artisans of the previous century, or the century before,
work that was done _manually_, for the most part, and not aided
by computers that should help make things much more uniform.

you want "arbitrary" today?   take a good look at the .html versions
that have come out of distributed proofreaders over the last 6 years.
it's just a shame that all of the hard work that went into making 'em
is going to have to be tossed out, regretfully, when future digitizers
conclude that it's simply _easier_ to re-do the work -- from scratch --
than to try to puzzle out the unique make-up of each of those files...

(and yes, david, i do know that you have been one of the voices over
on the d.p. boards in favor of greater standardization of the .html,
so i salute you for taking that stand there; you are on the right side.)


>    heck, it doesn't even show how to handle sidenotes, 
>    those things ubiquitous in pre-18th century printing.

some sidenotes are essentially headings, and thus should be treated
in that manner.   others are annotations, and should be treated as such.
it is only your carelessness that now lumps both of these cases together.

really, david, if you want to "prove" something or other, you're going to
have to work much harder than this.   do you really think that i've thought
about it and examined literally thousands of books, and not encountered
some "sidenotes" on one occasion or another?   do you really believe that
i haven't rolled my eyes time after time when sidenotes were "discussed"
on the d.p. boards?   (ditto for "small caps" markup in the last 6 months;
markup has done to you guys what it always does to people, which is to
get them embroiled in minutiae such that they badly lose the big picture.)


>    Not to mention math. 

oh david, you've mentioned "math" many times.   over and over again, david.

and over and over again, i've replied that i'll handle equations as graphics.

eventually, if there is a compelling need, i might even adapt one of the
existing plain-text solutions for rendering graphics (tex, anyone?) to
do the job.   but i doubt there will ever be such a "compelling need" to
pull math equations out of books that are, in most cases, 80+ years old.

(and david, usually you mention "music" along with math.   how come
you didn't do that this time?   perhaps you've forgotten the drill, man.)


>    Why am I even bothering to try and prove that statement 
>    comes out of your own little world?

yes, david, why are you even bothering to _try_ and do
something that you are so clearly incapable of doing?
something that you have _never_ been able to do before?
the only way you're going to "defeat" me is to put me back
into your "kill" file.   stick your head in the ground, ostrich...

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 11:36:17 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 11:36:23 2006
Subject: [gutvol-d] any more questions, lee?
Message-ID: <3b3.9849ea8.32652b21@aol.com>

lee said:
>   "Said the pieman to Simple Simon, 
>    'Show me first your penny.'"

so lee, do you have any more questions
about simple-simon authoring-tools?

if you do, i'll be happy to answer 'em...

-bowerbird
From cannona at fireantproductions.com  Mon Oct 16 11:37:18 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Mon Oct 16 11:37:39 2006
Subject: [gutvol-d] oh geez, part 3
References: <c2c.8df89e.32651782@aol.com>
Message-ID: <001d01c6f152$279e35d0$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bowerbird wrote:
> oh david, you've mentioned "math" many times.   over and over again,
> david.
>
> and over and over again, i've replied that i'll handle equations as
> graphics.

Doesn't sound terribly accessible.


>
> eventually, if there is a compelling need, i might even adapt one of the
> existing plain-text solutions for rendering graphics (tex, anyone?) to
> do the job.   but i doubt there will ever be such a "compelling need" to
> pull math equations out of books that are, in most cases, 80+ years old.

For that matter, why would we even care about crappy old books in the first
place.

Aaron Cannon

- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFM9FyI7J99hVZuJcRAtBqAKDwQRMmqqOeCtyYa3S0VK/f18AkNwCgrkzP
/sqIqFvhWXkoQjC7UESuvAM=
=ro11
-----END PGP SIGNATURE-----

From prosfilaes at gmail.com  Mon Oct 16 12:04:09 2006
From: prosfilaes at gmail.com (David Starner)
Date: Mon Oct 16 12:04:12 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <c2c.8df89e.32651782@aol.com>
References: <c2c.8df89e.32651782@aol.com>
Message-ID: <6d99d1fd0610161204q100ba7cfj7559963b4b7b38f6@mail.gmail.com>

On 10/16/06, Bowerbird@aol.com <Bowerbird@aol.com> wrote:
>  even _600_ exceptions would be just 3% of the ~20,000 e-texts now.

You claimed that your list was "strong, inclusive, and exhaustive",
but only handle 97% of the texts? There's very few fields where a 97%
success rate is considered good enough, outside head-to-head
competitions; usually, if a 97% success rate is tolerated, there's
research to make it better, but it's a hard enough problem that 97% is
the best possible now. That's not the case here.

>  my own system will eventually be able to handle quite complex tables,
>  when i find the need to develop it that far.

So your system doesn't currently handle everything necessary.

>  >   The test document certainly shows no evidence of the arbitrary evilness
>  >   the Early English Text Society and friends saw fit to hand us;
>
>  arbitrariness?

So you avoid the point; that your test-suite doesn't consist of real
life problems and isn't nearly as painful as the real life problems I
see.

>  >   heck, it doesn't even show how to handle sidenotes,
>  >   those things ubiquitous in pre-18th century printing.
>
>  some sidenotes are essentially headings, and thus should be treated
>  in that manner.  others are annotations, and should be treated as such.
>  it is only your carelessness that now lumps both of these cases together.

That's an editor's job. I'm not an editor; I merely want to reproduce
the text as is.

>  >   Not to mention math.
>
>  oh david, you've mentioned "math" many times.  over and over again, david.
>
>  and over and over again, i've replied that i'll handle equations as
> graphics.

Oh, wonderful. Let's reproduce the tragic failures of equation
typesetting, and add the problem that the font used for the text and
the font used for the equations have no similarity. I scan math books
frequently to get a more legible copy, not something that preserves
all the failures of the original typesetting.

>  eventually, if there is a compelling need, i might even adapt one of the
>  existing plain-text solutions for rendering graphics (tex, anyone?) to
>  do the job.  but i doubt there will ever be such a "compelling need" to
>  pull math equations out of books that are, in most cases, 80+ years old.

The way to sway people to use your programs is not to dismiss the
things they consider as compelling as unimportant.

>  yes, david, why are you even bothering to _try_ and do
>  something that you are so clearly incapable of doing?

When you tout a solution that doesn't fix our problems and wonder why
we don't flock to it, I'd say you're in your own little world.
From hyphen at hyphenologist.co.uk  Mon Oct 16 12:06:28 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Mon Oct 16 12:06:48 2006
Subject: [gutvol-d] online editing of documents, collaboratively
In-Reply-To: <595.444e5f55.3265158f@aol.com>
References: <595.444e5f55.3265158f@aol.com>
Message-ID: <svl7j2tkd9hbeamanc45p1sinpbonce165@4ax.com>

On Mon, 16 Oct 2006 13:04:15 EDT,  Bowerbird@aol.com wrote:

|dave said:
|>    Just a *silly* USian cultural Oddity.
|
|yeah right.
|
|i was thinking we'd made it through
|the recent friday-the-13th when --
|rumble! -- earthquake rocks hawaii.
|
|unrelated, you say?   sure it's "unrelated".

Our old friend chance.  
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From Bowerbird at aol.com  Mon Oct 16 12:12:06 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 12:12:18 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <c35.55ba4d2.32653386@aol.com>

aaron said:
>    Doesn't sound terribly accessible.

"if there is a compelling need..."

(but what do you suggest as being more "accessible"?)


>    For that matter, why would we even care about 
>    crappy old books in the first place.

seems like a rather silly position for you to adopt, aaron.

and _quite_ a risky one to be trying to put in _my_ mouth.

classic literature doesn't "get dated", which is why the
"crappy old books" that contain it deserve our attention.

_math_ books, on the other hand, and most especially
their _equations_, don't fall in quite the same category.

either those equations have become "classic" themselves,
in which case there is little need to do anything more than
present them as illustrations, or they have become _dated_
(in light of further developments), in which case there is no
need to do anything more than present them as illustrations.
did you notice the symmetry there?

at any rate, i am sure that math people have developed ways
to share their work with each other via plain-text e-mail, and
-- if the need arises -- i will hear from them how they do that,
and incorporate those conventions into zen markup language.

meanwhile, for the 99.7% of the project gutenberg library which
currently has no need _at_all_ (let alone any _compelling_ need)
for math equations, i don't have to worry about them, thank you.

***

oh, and by the way, it is the very fact that _these_ are the kind of
"objections" that are made to my list of structures that informs me
that that list is sufficiently complete that i don't have to worry about it.

if you guys had anything _substantive_ to say, you certainly would,
and i would simply say "thanks" and add it to my list of structures...

as it is, you've now had _years_ to think about it and scour books to
try and find something that falls outside my list, and you've got zilch.

and hey, here's a quick piece of advice you might consider:
when you've got _zilch_, the best thing to do is stay silent...

-bowerbird
From joshua at hutchinson.net  Mon Oct 16 12:26:13 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Mon Oct 16 12:26:24 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <11633580.1161026773348.JavaMail.?@fh1064.dia.cp.net>

----Original Message----
From: Bowerbird@aol.com

> and hey, here's a quick piece of advice you might consider:
> when you've got _zilch_, the best thing to do is stay silent...


***

And yet you keep making noise...

Josh

PS  Just because you dismiss everyone's arguments, doesn't mean they 
aren't valid.  It just makes you look like a small child that has his 
fingers in his ears, yelling, "La-la-la-la.  I can't hear you!  
La-la-la-la-laaaaa!"

From Bowerbird at aol.com  Mon Oct 16 12:31:43 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 12:31:54 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <cd0.517866.3265381f@aol.com>

david said:
>    You claimed that your list was 
>    "strong, inclusive, and exhaustive",
>    but only handle 97% of the texts?

97% on the _first_pass_, david.   i ain't done yet.
how much is the "official" .tei handling right now?


>    So you avoid the point; that your test-suite 
>    doesn't consist of real life problems and isn't 
>    nearly as painful as the real life problems I see.

well, as soon as we've got some systems that can pass
_this_ test-suite, then we can start making it tougher
with some of the features that are rare in the library.

i have repeatedly suggested that the x.m.l. advocates
and the p.g.t.e.i. freaks should show us how they would
mark up this test-suite and convert it to various formats.
but i have had no takers...

as you noted, 97% _is_ good if it's head-to-head against
something that's not as high.   and so far anyway, it's like
my 97% against .tei vapor...


>   That's an editor's job. I'm not an editor; 
>    I merely want to reproduce the text as is.

yes, i _am_ an editor.   any time you are
creating a new version -- and a digitized
version of a paper-book _is_ a new version
-- you'd better be prepared to be an editor.

i don't see any point in "reproducing the text."
if i wanna see what the paper-book looked like,
i much prefer to just go take a look at the scans.

what i want to create is something that _works_well_
as an electronic-book, not that mimics a paper-book.


>   I scan math books frequently to get a more legible copy, 
>    not something that preserves all the failures of 
>    the original typesetting.

you're being inconsistent.   do you preserve the original, or not?


>   The way to sway people to use your programs is 
>    not to dismiss the things they consider as compelling 
>    as unimportant.

um, sorry, you're not important enough for me to "sway".
i'm just putting myself on the record, so i can return years 
from now and say "i told you so".   believe whatever you will.


>    When you tout a solution that doesn't fix our problems
>    and wonder why we don't flock to it, 
>    I'd say you're in your own little world.

and i am happy to be in "my own little world" instead of yours.
it would be quite disconcerting to me to have a bunch of idiots
suddenly "flock" into my world; i'd have to rethink _everything_.

-bowerbird

p.s.   and i _still_ haven't added anything to my list of structures...
From Bowerbird at aol.com  Mon Oct 16 12:34:28 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 12:34:33 2006
Subject: [gutvol-d] online editing of documents, collaboratively
Message-ID: <c5e.431f598.326538c4@aol.com>

meanwhile, does anyone have anything substantial to say 
concerning the online editing of documents, collaboratively?

or is this another great tool we're gonna pretend doesn't exist?

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 12:38:07 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 12:38:18 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <c48.5b275ee.3265399f@aol.com>

josh said:
>    Just because you dismiss everyone's arguments, 
>    doesn't mean they aren't valid.  It just makes you 
>    look like a small child that has his fingers in his ears, 
>    yelling, "La-la-la-la.  I can't hear you!  La-la-la-la-laaaaa!"

josh's post = troll.

-bowerbird
From jon at noring.name  Mon Oct 16 11:50:48 2006
From: jon at noring.name (Jon Noring)
Date: Mon Oct 16 12:49:34 2006
Subject: [gutvol-d] Content MathML (was "oh geez, part 3")
In-Reply-To: <001d01c6f152$279e35d0$0300a8c0@blackbox>
References: <c2c.8df89e.32651782@aol.com>
	<001d01c6f152$279e35d0$0300a8c0@blackbox>
Message-ID: <1269046842.20061016125048@noring.name>

Aaron wrote:
> Bowerbird wrote:

>> oh david, you've mentioned "math" many times.   over and over again,
>> david.
>>
>> and over and over again, i've replied that i'll handle equations as
>> graphics.

> Doesn't sound terribly accessible.

Yep. "The blind be damned".


>> eventually, if there is a compelling need, i might even adapt one of the
>> existing plain-text solutions for rendering graphics (tex, anyone?) to
>> do the job.   but i doubt there will ever be such a "compelling need" to
>> pull math equations out of books that are, in most cases, 80+ years old.

> For that matter, why would we even care about crappy old books in the first
> place.

Aaron brings up a good point that there is contemporary content being
produced. Even if the goal is to disintermediate publishers, one still
has to handle mathematical equations in a way which benefits users.

This is what is intriguing about the content flavor of MathML, where
many (but not all) mathematical expressions can be made "understandable"
by mathematics software. This allows the ebook containing such markup to
be able to directly call such programs for plotting, solving, etc.

The introduction of this chapter about content MathML in the MathML
spec is excellent:

   http://www.w3.org/TR/MathML2/chapter4.html
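
As a rough illustration of what "understandable by mathematics
software" can mean, here is a minimal Python sketch (not taken from the
spec). The fragment, the variable values and the `evaluate` helper are
assumptions made up for this example, and only a handful of content
operators are handled:

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Hypothetical fragment: the discriminant b^2 - 4ac in Content MathML.
DISCRIMINANT = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><minus/>
    <apply><power/><ci>b</ci><cn>2</cn></apply>
    <apply><times/><cn>4</cn><ci>a</ci><ci>c</ci></apply>
  </apply>
</math>
"""


def evaluate(node, env):
    """Evaluate a small subset of Content MathML against variable values."""
    tag = node.tag.replace(MATHML_NS, "")
    children = list(node)
    if tag == "math":
        return evaluate(children[0], env)
    if tag == "cn":                       # numeric literal
        return float(node.text.strip())
    if tag == "ci":                       # identifier, looked up in env
        return env[node.text.strip()]
    if tag == "apply":                    # first child is the operator
        op = children[0].tag.replace(MATHML_NS, "")
        args = [evaluate(child, env) for child in children[1:]]
        if op == "plus":
            return sum(args)
        if op == "times":
            product = 1.0
            for value in args:
                product *= value
            return product
        if op == "minus":
            return -args[0] if len(args) == 1 else args[0] - args[1]
        if op == "divide":
            return args[0] / args[1]
        if op == "power":
            return args[0] ** args[1]
        raise ValueError("unsupported operator: " + op)
    raise ValueError("unsupported element: " + tag)


root = ET.fromstring(DISCRIMINANT)
print(evaluate(root, {"a": 1, "b": 5, "c": 6}))
```

Because the markup records the operators themselves rather than their
printed appearance, the same tree could in principle be handed to a
plotter or solver instead of this toy evaluator.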

So, how would ZML handle semantic MathML markup? It is, of course, in
"evil XML", so would that not be allowed in ZML?

Jon Noring

From joshua at hutchinson.net  Mon Oct 16 12:50:08 2006
From: joshua at hutchinson.net (mailbox@hutchinson.net)
Date: Mon Oct 16 12:50:11 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <27903894.1161028208906.JavaMail.?@fh1064.dia.cp.net>


----Original Message----
From: Bowerbird@aol.com
>> david said:
>>    You claimed that your list was 
>>    "strong, inclusive, and exhaustive",
>>    but only handle 97% of the texts?
>
> 97% on the _first_pass_, david.   i ain't done yet.
> how much is the "official" .tei handling right now?

Everything you've mentioned AND everything David mentioned (there might be something in the Early English stuff he mentioned it can't do, though at least one of them has been done in PGTEI).

> i have repeatedly suggested that the x.m.l. advocates
> and the p.g.t.e.i. freaks should show us how they would
> mark up this test-suite and convert it to various formats.
> but i have had no takers...

Well, I don't know what "test-suite" of texts you refer to, but honestly, I usually ignore most of your pointless ramblings because I'm too busy actually DOING something.  You know, like posting close to 100 books in TEI format to the PG archives. (And before anyone calls me on it, *I* haven't posted that many; I'm including other people's efforts on TEI in that "close to 100".  There have been 3 other people that have posted books in TEI format that I know of.)

Josh
From Bowerbird at aol.com  Mon Oct 16 13:00:50 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 13:00:59 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <c16.71ed3fe.32653ef2@aol.com>

josh said:
>    Well, I don't know what "test-suite" of texts you refer to, but 
>    honestly, I usually ignore most of your pointless ramblings

there's a frank admission, folks.

he doesn't even know what's being discussed,
but he feels qualified to throw in a few insults.

speaks for itself, that post does, and speaks volumes.

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 13:12:56 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 13:13:08 2006
Subject: [gutvol-d] oh geez, part
Message-ID: <cab.df21f8.326541c8@aol.com>

and since we've degenerated to the troll level,
let me just cut to the chase at the finale, ok?

ultimately, for music, i will probably do exactly
what d.p. has done, and use lilypond or finale,
and route that file to either an external player
or one that i have embedded in my viewer-app.
unlike music-markup-language, lilypond shares
my core philosophy of simplicity and elegance...

i'll follow the same approach for math equations,
routing them to an equation editor that is either
(a) an external app, or (b) embedded in my viewer.
i'd guess it will probably be tex-based rather than
math-markup-language, as tex is widely preferred,
and expressible in utf-8.   (math-markup-language
is also expressible in utf-8, but it's also got all that
angle-bracket gunk in it, which i'm badly allergic to.)

so, as usual, it's simple as pie for me to "answer" your
"objections", i just wanna see how desperate you get.

and none of this needs to go in my test-suite yet.

(but, so you know, i've already got .mp3 support,
and .aiff and a bunch of other music formats, and
i'm guessing that quicktime will support .svg soon,
if it doesn't already, which can be used for equations,
so none of this is the problem it's been made out as.)

of course, anyone else is free to make their _own_
test-suite, any time they want.   like i told ya earlier,
jon noring is looking for just this type of feedback...

-bowerbird
From traverso at dm.unipi.it  Mon Oct 16 13:23:02 2006
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Mon Oct 16 13:21:11 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <c35.55ba4d2.32653386@aol.com> (Bowerbird@aol.com)
References: <c35.55ba4d2.32653386@aol.com>
Message-ID: <200610162023.k9GKN2N19327@pico.dm.unipi.it>

>>>>> "Bowerbird" == Bowerbird  <Bowerbird@aol.com> writes:

    Bowerbird> classic literature doesn't "get dated", which is why
    Bowerbird> the "crappy old books" that contain it deserve our
    Bowerbird> attention.

    Bowerbird> _math_ books, on the other hand, and most especially
    Bowerbird> their _equations_, don't fall in quite the same
    Bowerbird> category.

    Bowerbird> either those equations have become "classic"
    Bowerbird> themselves, in which case there is little need to do
    Bowerbird> anything more than present them as illustrations, or
    Bowerbird> they have become _dated_ (in light of further
    Bowerbird> developments), in which case there is no need to do
    Bowerbird> anything more than present them as illustrations.  did
    Bowerbird> you notice the symmetry there?

Completely false. A lot of contemporary math research "rediscovers"
XIXth and early XXth century works that have been forgot for 70-100
years or more, and restarts from them.

    Bowerbird> at any rate, i am sure that math people have developed
    Bowerbird> ways to share their work with each other via plain-text
    Bowerbird> e-mail, and -- if the need arises -- i will hear from
    Bowerbird> them how they do that, and incorporate those
    Bowerbird> conventions into zen markup language.

Sure we do. We use TeX (or pseudo-TeX fragments).

{-b\pm\sqrt{b^2-4ac}}\over{2a} for the solutions of a quadratic
equation ax^2+bx+c=0. And if you can read the formula, you can read
its TeX form.
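
For comparison, a sketch of the same expression in LaTeX's \frac
notation (the line above uses plain TeX's \over; the leading "x =" is
added here only for readability):

```latex
\[
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\]
```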

Carlo
From lee at novomail.net  Mon Oct 16 13:31:17 2006
From: lee at novomail.net (Lee Passey)
Date: Mon Oct 16 13:29:38 2006
Subject: [gutvol-d] any more questions, lee?
In-Reply-To: <3b3.9849ea8.32652b21@aol.com>
References: <3b3.9849ea8.32652b21@aol.com>
Message-ID: <4533EC15.5090901@novomail.net>

Bowerbird@aol.com wrote:
>    lee said:
>>   "Said the pieman to Simple Simon,
>>   'Show me first your penny.'"
> 
> so lee, do you have any more questions
> about simple-simon authoring-tools?
> 
> if you do, i'll be happy to answer 'em...

Any /more/ questions? If there was a question implied in my post, it 
didn't get answered, and frankly I don't have any other questions for you.

The only question I have for you is where can I obtain a non-vaporous 
copy of your authoring tool, so I can run it through its paces.

-- 
Nothing of significance below this line.

From joey at joeysmith.com  Mon Oct 16 13:39:42 2006
From: joey at joeysmith.com (joey)
Date: Mon Oct 16 13:46:12 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <c35.55ba4d2.32653386@aol.com>
References: <c35.55ba4d2.32653386@aol.com>
Message-ID: <20061016203942.GC29634@joeysmith.com>

On Mon, Oct 16, 2006 at 03:12:06PM -0400, Bowerbird@aol.com wrote:
> if you guys had anything _substantive_ to say, you certainly would,
> and i would simply say "thanks" and add it to my list of structures...
> 
> as it is, you've now had _years_ to think about it and scour books to
> try and find something that falls outside my list, and you've got zilch.
> 
> and hey, here's a quick piece of advice you might consider:
> when you've got _zilch_, the best thing to do is stay silent...

Again: "Argument from ignorance". Just because I choose not to solve your
problems for you doesn't mean you don't have problems.
From Bowerbird at aol.com  Mon Oct 16 14:48:09 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 14:48:20 2006
Subject: [gutvol-d] any more questions, lee?
Message-ID: <bf0.5c2087f.32655819@aol.com>

lee said:
>    The only question I have for you is 
>    where can I obtain a non-vaporous copy 
>    of your authoring tool, so I can run it through its paces.

ok, that's a good question.

you can't have a copy.

not now, anyway, and probably not for another year or so.

so i suggest you go off and reinvent the wheel yourself...          :+)
because, lee, that's what i really want you to do: waste your time...

but i'm not really sure why you think you need a special authoring tool
to create a .zml file, since any ordinary text-editor would do just fine...

remember, david moynihan converted the entire p.g. _library_ (and more)
into a stunning array of e-book formats.   it's really not that difficult to do...

besides, i have pointed you to the markdown dingus.   if you tell me
just exactly why markdown won't serve your purpose, i can help you.

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 14:51:52 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 14:52:00 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <522.6631b46.326558f8@aol.com>

joey said:
>    Just because I choose not to solve your problems for you 
>    doesn't mean you don't have problems.

i'm not asking you to "solve my problems", joey,
i'm saying that unless you tell me what they _are_,
i'm just gonna have to assume that i don't have any.

besides the ones i already know about...          :+)

but if you want other people to know i have problems,
my guess is that you're gonna have to tell those people
what my problems are.   i sure don't have any difficulties
telling people about the problems that _i_ see elsewhere.

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 14:59:11 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 14:59:27 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <269.11b25bc6.32655aaf@aol.com>

carlo said:
>    A lot of contemporary math research "rediscovers"
>    XIXth and early XXth century works that have been forgot 
>    for 70-100 years or more, and restarts from them.

and they will be able to "restart" from an illustration
of the equation just as easily as they've been able to 
"restart" from that same equation in a paper-book...

would it be nice if someone had first done the work
of making that equation (and all the other equations
in that book, and in every other book) _importable_
into today's equation software?   well, sure, i'd guess,
but i would hope that today's mathematicians don't
_expect_ us to do that for them as a matter of course.

and i hope the architects and engineers don't expect us
to make all the diagrams in all the books cad/cam-ready.

and musicians shouldn't expect us to input all the music,
just so it's immediately available to them without any work.

i mean, _sure_, if there are volunteers who _want_ to do
this stuff, then i'm all in favor of it, and i can support it
_just_as_well_as_other_systems,_thank_you_very_much_.

but please, kids, don't hold up your rate of ~2,000 books
digitized per year as supporting your workflow or methods.


>    Sure we do. We use TeX (or pseudo-TeX fragments).

and that's why that's what i'll probably do as well,
when the time comes that i feel that it's necessary,
because that's my modus operandi, to utilize the
existing conventions, to best leverage current work.

but for now, i'm not at all worried about this "problem".

***

now, i have posted how i will handle these issues, so
let's all just move on to something more productive...

-bowerbird
From Bowerbird at aol.com  Mon Oct 16 16:05:48 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 16:05:55 2006
Subject: [gutvol-d] just for the record
Message-ID: <c2a.3335670.32656a4c@aol.com>

for the record, here's the list from the "one-laptop-per-child" people...

>    In simplest terms, a list of our markup requirements is as follows:
>    1. Bold, italic, and monospace text
>    2. Ordered and unordered lists, nested arbitrarily
>    3. Four levels of headings
>    4. Blockquotes
>    5. Internal and external links
>    6. Custom date formatting
>    7. References (reference link style for external URLs)
>    8. Simple tables
>    9. Horizontal rules
>    10. Full extensibility with parser hooks

looks pretty familiar...

-bowerbird
From cannona at fireantproductions.com  Mon Oct 16 16:50:10 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Mon Oct 16 16:50:16 2006
Subject: [gutvol-d] oh geez, part 3
References: <c35.55ba4d2.32653386@aol.com>
Message-ID: <005901c6f17d$d2c6d590$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've read several important statements in the past day or so which I feel
serve to highlight one of the major issues in this discussion.

Bowerbird wrote on ZML:
"at any rate, i am sure that math people have developed ways
to share their work with each other via plain-text e-mail, and
- -- if the need arises -- i will hear from them how they do that,
and incorporate those conventions into zen markup language."

- From another message, same author:
"i mean, _sure_, if there are volunteers who _want_ to do
this stuff, then i'm all in favor of it, and i can support it
_just_as_well_as_other_systems,_thank_you_very_much_.
...
but for now, i'm not at all worried about this 'problem'."

On the support for complex tables in ZML, same author:
"my own system will eventually be able to handle quite complex tables,
when i find the need to develop it that far.  and if you'd like some proof,
then hand me a list of 100 e-texts that use tables, and i will tackle them
first when the time for 'attacking tables' comes up big on my agenda."


- From Josh when asked what PGTEI can handle:
"Everything you've mentioned AND everything David mentioned (there might be
something in the Early English stuff he mentioned it can't do, though at
least one of them has been done in PGTEI).
...
I'm too busy actually DOING something. You know, like posting close to 100
books in TEI format to the PG archives. (And before anyone calls me on it,
*I* haven't posted that many; I'm including other people's efforts on TEI in
that "close to 100". There have been 3 other people that have posted books
in TEI format that I know of.)"

A quick look at the PGTEI documentation confirms that PGTEI does in fact
have support for embedding LaTeX equations.

So, we've got two competing systems.  One of them has been used to publish
several PG etexts, and seems to support complex tables, math, and various
other formatting brought up today.  The other has not been used to produce
nearly so many texts and does not yet support math, complex tables, and a
few other things.  In addition, the maintainer of the latter system
apparently does not feel that math and complex tables are a high enough
priority yet, and wishes to be shown x number of examples of the need for
such before he will add them to his format.

This does not cover all of the arguments for and against each system, not by
a long shot.  However, from the above, it is quite clear, at least to me,
which system it would be most beneficial for Project Gutenberg to adopt.

Sincerely
Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFNBq1I7J99hVZuJcRAjqYAKDotAgDJZpz7ApklVXQZCqbsQ0u+gCgzJ1O
tJbeBThERLdgwxYiB+y5PoY=
=/Gn6
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Mon Oct 16 17:03:54 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 16 17:04:14 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <277.123dcdf7.326577ea@aol.com>

thanks for your "summary", aaron...

to repeat, i don't care one whit what system
project gutenberg "adopts".   indeed, i'd like
for josh to spend a whole _boatload_ of time
making .tei versions of all the books he does.

and as soon as josh is ready to have me take
another close look at the .html and the .pdfs
created by his .tei, i'll be happy to do that too,
using the same criteria they failed at last time.

-bowerbird

p.s.   i really should inform you that it is terribly
simple for me to convert p.g. .txt files to .zml,
so it's not wise of you to do "comparison counts"
between .zml and .tei, because that will bite you.
From cannona at fireantproductions.com  Mon Oct 16 17:13:59 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Mon Oct 16 17:15:24 2006
Subject: [gutvol-d] oh geez, part 3
References: <269.11b25bc6.32655aaf@aol.com>
Message-ID: <008101c6f181$55d42700$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bowerbird wrote:

> carlo said:
>>    A lot of contemporary math research "rediscovers"
>>    XIXth and early XXth century works that have been forgot
>>    for 70-100 years or more, and restarts from them.
>
> and they will be able to "restart" from an illustration
> of the equation just as easily as they've been able to
> "restart" from that same equation in a paper-book...

Just as someone can read a scanned image of a page just as easily as they
can read that page in a paper book, so why OCR and proofread at all?

>
> would it be nice if someone had first done the work
> of making that equation (and all the other equations
> in that book, and in every other book) _importable_
> into today's equation software?   well, sure, i'd guess,
> but i would hope that today's mathematicians don't
> _expect_ us to do that for them as a matter of course.

It depends on your definition of import, but yes, most equations from most
books can be imported, in one form or another, into software.  Also, if PG is
going to add a math text to the archive, then it would make sense to have a
standard format that will support it.

Mathematics can be just as much a part of a book as tables can, or any other
type of unusual formatting.  By your logic, one could make the argument:
"would it be nice if someone had first done the work of making that table
(and all the other tables in that book, and in every other book)
_importable_ into today's table parsing software?   well, sure, i'd guess,
but i would hope that today's table readers don't _expect_ us to do that for
them as a matter of course."

In fact, if what you say is true, then, as I mentioned above,  it could be
argued that every page should be left as an image, because readers shouldn't
expect us to do all that work of OCRing, proofing and formatting text for
them.


Sincerely
Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFNCCaI7J99hVZuJcRAl+cAJ9GzFpBDAXhRN+Dhuyq5m1cTIbvMQCfXdAP
51pH9mJ0ChLFElkg8qrl7wQ=
=nuJq
-----END PGP SIGNATURE-----

From traverso at dm.unipi.it  Tue Oct 17 05:27:20 2006
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Tue Oct 17 05:25:24 2006
Subject: [gutvol-d] oh geez, part 3
In-Reply-To: <269.11b25bc6.32655aaf@aol.com> (Bowerbird@aol.com)
References: <269.11b25bc6.32655aaf@aol.com>
Message-ID: <200610171227.k9HCRKZ03283@pico.dm.unipi.it>

>>>>> "Bowerbird" == Bowerbird  <Bowerbird@aol.com> writes:

    Bowerbird> carlo said:
    >> A lot of contemporary math research "rediscovers" XIXth and
    >> early XXth century works that have been forgot for 70-100 years
    >> or more, and restarts from them.

    Bowerbird> and they will be able to "restart" from an illustration
    Bowerbird> of the equation just as easily as they've been able to
    Bowerbird> "restart" from that same equation in a paper-book...

    Bowerbird> would it be nice if someone had first done the work of
    Bowerbird> making that equation (and all the other equations in
    Bowerbird> that book, and in every other book) _importable_ into
    Bowerbird> today's equation software?  well, sure, i'd guess, but
    Bowerbird> i would hope that today's mathematicians don't _expect_
    Bowerbird> us to do that for them as a matter of course.

That's exactly what mathematicians (at least, some) are doing. And
surely they are not expecting to have it done without a semantic
markup. OpenMath, MathML, Texmacs, Doyen are some of the names. They
are mostly either TeX based, or can import (carefully written) LaTeX.

Carlo
From j.hagerson at comcast.net  Tue Oct 17 06:49:39 2006
From: j.hagerson at comcast.net (John Hagerson)
Date: Tue Oct 17 06:59:59 2006
Subject: [gutvol-d] Seeking Volunteer who produced e-book 12254
Message-ID: <00a001c6f1f3$1c6e3790$1f12fea9@sarek>

Project Gutenberg e-book 12254 is titled <i>Illustrated History of
Furniture: From the Earliest to the Present Time</i>, the author is
Frederick Litchfield.

I have a request from Brazil to obtain a high-resolution scan of
illustration 48, if it could be made available.

Thank you for your assistance.

John Hagerson


From sam.bretheim at gmail.com  Tue Oct 17 09:37:41 2006
From: sam.bretheim at gmail.com (Sam Bretheim)
Date: Tue Oct 17 09:39:26 2006
Subject: [gutvol-d] just for the record
In-Reply-To: <c2a.3335670.32656a4c@aol.com>
References: <c2a.3335670.32656a4c@aol.com>
Message-ID: <453506D5.1090801@gmail.com>

Several important issues were lost in yesterday's discussion of the 
relative merits of the PGTEI and ZML production schemes.


First, I haven't seen any suggestion here that PG or DP as a whole 
should adopt a ZML-based workflow, so there's little point to agonizing 
about whether someone working semi-independently is most comfortable 
using ZML.  As long as the result is good-quality XHTML and text, do we 
care whether the post-processor or independent submitter used ZML, TEI, 
Dreamweaver, Word, groff, vi, ed, or a Ouija board?  The point is to get 
the books proofread and distributed.


Second, one of the most important reasons to use "heavyweight" markup 
languages is that they make information about the meaning of a document, 
rather than just its surface structure, available to automatic search, 
browsing, and analysis tools.

For example, extracting bibliographic citations from the presentational 
information available in PDF/PostScript/HTML/text documents is a serious 
chore that the teams at CiteSeer, Google Scholar, IEEE, and numerous 
other organizations have spent many person-years on; the results are 
still terribly error-ridden, and the programs that produce them are full 
of shady heuristic guess-work.  Extracting references (or virtually any 
other meaning) from natural body text, without the formalism of an 
academic paper's References section, is a hard unsolved AI problem. 
However, if the authors of a document supply proper semantic markup, 
such as TEI or BibTeX, getting and analyzing citations from the document 
is trivial.
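
To show how little work that is, here is a minimal Python sketch. The
sample text is made up for this example rather than taken from a real
PG file, and `extract_citations` is a hypothetical helper; only the
element names (bibl, author, title) come from TEI:

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment with two marked-up citations.
SAMPLE = """
<text>
  <body>
    <p>Change ringing is treated at length in
      <bibl><author>Stedman, Fabian</author>
            <title>Tintinnalogia</title></bibl>,
      while puzzles of this kind are collected in
      <bibl><author>Dudeney, Henry</author>
            <title>Amusements in Mathematics</title></bibl>.</p>
  </body>
</text>
"""


def extract_citations(xml_text):
    """Return one dict per <bibl>, mapping child element names to their text."""
    root = ET.fromstring(xml_text)
    citations = []
    for bibl in root.iter("bibl"):
        fields = {}
        for child in bibl:
            # Collapse the line breaks and indentation inside each field.
            fields[child.tag] = " ".join((child.text or "").split())
        citations.append(fields)
    return citations


for citation in extract_citations(SAMPLE):
    print(citation["author"], "/", citation["title"])
```

No heuristics are involved: the citations are found by element name,
wherever they occur in the running text.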

If reader software understands the citations and allusions in a 
document, it can add clickable links to them, thus making following the 
references a natural and near-instant action.  But links are just the 
easiest thing we can do with software that understands references.  It's 
nearly as easy to add Xanadu-style bidirectional links, so that when 
reading a document we can see which other documents refer to it; by 
counting those references, we can guess the importance of a document.  
These links can be much more powerful than simple HTML-style links, 
because they can encode information about the type of reference: Is it a 
formal citation, an allusion, a quotation, a paraphrase, a 
plagiarism, ...? Is this article a review of that movie? Is the 
review positive or negative?
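
A sketch of the bidirectional, typed links described above, in Python.
The document names and reference types are invented for illustration,
and counting inbound references is only the crudest possible measure
of importance:

```python
from collections import defaultdict

# Hypothetical typed references: (source document, kind, target document).
REFERENCES = [
    ("essay-a", "citation", "book-x"),
    ("essay-b", "quotation", "book-x"),
    ("review-c", "review", "film-y"),
    ("essay-b", "allusion", "essay-a"),
]


def build_backlinks(references):
    """Invert forward references so each target lists who refers to it."""
    backlinks = defaultdict(list)
    for source, kind, target in references:
        backlinks[target].append((source, kind))
    return dict(backlinks)


def rank_by_inbound(backlinks):
    """Order targets by number of inbound references, ties broken by name."""
    return sorted(backlinks, key=lambda doc: (-len(backlinks[doc]), doc))


backlinks = build_backlinks(REFERENCES)
for doc in rank_by_inbound(backlinks):
    print(doc, backlinks[doc])
```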

References are just one fairly simple example of what semantic and 
ontological markup makes possible; properly marked-up equations, tables, 
diagrams, and sheet music can be subjected to all sorts of processing 
that make them even more valuable to society than they were in their 
original contexts.  Considering how time-intensive properly proofreading 
a book is, adding a bit of semantic markup at the post-proofing stage is 
quick and easy.


Third, the visual appeal and simplicity of the source code of an XML 
document is entirely beside the point, because hand-editing XML source 
and hoping it validates is a hackish trick of last resort.  Many people 
do it even now because there are no great affordable XML editors, but 
ultimately it's not the right way to produce a document.


Finally, for those who insist on hand-editing the source of documents, 
markup languages like XML in which the tag and entity syntax is simple 
and well-defined have natural advantages over homegrown approaches 
(MediaWiki, ZML, DP-ish) for which a standard choice of parser 
substitutes for a good, stable, standardized specification document.  
One of Larry Wall's principles of language design is that "easy things 
should be easy, and hard things should be possible".  The fact that TEI 
could do better at easy things is completely outweighed by the fact that 
the homemade solutions make the hard things impossible.  If we were to 
suggest that all new users learn something like ZML (or, by analogy, a 
similarly simplistic piece of software like iMovie or FrontPage), at a 
certain point most of them would grow beyond the capabilities of the 
software, and knowing how to use it would give them no insight 
whatsoever into producing more complex results with more sophisticated 
software; they would need to relearn everything, starting from square one.

Ease of importing text with conflicting encodings is one advantage: 
natural-language punctuation aside, in XML there are precisely three 
metacharacters to worry about; if you try to paste a mathematical or 
technical expression into a homemade language, you have the same nasty 
metacharacter-escaping problems dealt with by anyone who's tried to 
alter C source files using a bash script assisted by snippets of sed and 
awk.  If the language changes and makes more characters special, porting 
a document to a new version of the language becomes terribly difficult.
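
As a small illustration of the XML side of that comparison, Python's
standard library can escape pasted text for use as element content in
one call. The C-like snippet is made up for this example:

```python
from xml.sax.saxutils import escape, unescape

# A made-up technical expression full of characters that are special in XML.
snippet = "if (a < b && b > c) return a & mask;"

# escape() rewrites only '&', '<' and '>' by default.
escaped = escape(snippet)
print(escaped)

# The transformation is reversible, so nothing in the pasted text is lost.
assert unescape(escaped) == snippet
```

Attribute values additionally need their quote characters escaped, but
for running text these three replacements are all that is required.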

But what it really comes down to is extensibility: it doesn't take long 
to run through the 32 non-alphanumeric printable characters in ASCII and 
the few ways you can combine them with the 3 acceptable whitespace 
characters, particularly if you let users of every world language 
punctuate sentences in the fascinating array of ways that they invent.  
If you make a genuine attempt to add all of the semantic and 
presentational functionality that people need to a homemade markup 
language, at a certain point you will need to add increasingly long and 
complex expressions to deal with things that you didn't think of when 
you were designing the language, and while the source code will no 
longer be any easier to read or write than XML, you will still lack the 
large amount of careful, well-reasoned work that's gone into adding 
those features to TEI already, and you'll have nothing like the power of 
XML namespaces.
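
A minimal sketch of what namespaces buy you, in Python. The "book"
vocabulary and its URI are hypothetical; the MathML namespace URI is
the real one. The embedded equation needs no new punctuation in the
host language, and a parser can tell the two vocabularies apart:

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.org/ns/book"          # hypothetical vocabulary
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

DOC = f"""
<book xmlns="{BOOK_NS}" xmlns:m="{MATHML_NS}">
  <p>A difference:
    <m:math><m:apply><m:minus/><m:ci>p</m:ci><m:ci>q</m:ci></m:apply></m:math>
  </p>
</book>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    namespace, _, local = tag[1:].partition("}")
    return namespace, local


def elements_by_namespace(xml_text):
    """Group element names by namespace, in document order."""
    grouped = {}
    for element in ET.fromstring(xml_text).iter():
        namespace, local = split_tag(element.tag)
        grouped.setdefault(namespace, []).append(local)
    return grouped


for namespace, names in elements_by_namespace(DOC).items():
    print(namespace, names)
```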

The quest for "markup without markup" reminds me of Flannery O'Connor's 
"Holy Church of Christ Without Christ": ultimately not an honest or 
achievable aim.  A markup language is a markup language, regardless of 
whether its syntax uses tags between angle brackets or punctuation 
characters with carefully chosen whitespace.  Esoteric programming 
languages like Whitespace demonstrate this fact well.
From Bowerbird at aol.com  Tue Oct 17 11:41:55 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 17 11:42:11 2006
Subject: [gutvol-d] just for the record
Message-ID: <cc8.686f22.32667df3@aol.com>

wow, a thoughtful message, what a breath of fresh air,
a nice change from all the trolls, thank you much, sam.

i'll respond a bit later...           :+)

-bowerbird
From Bowerbird at aol.com  Tue Oct 17 11:54:56 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 17 11:55:05 2006
Subject: [gutvol-d] oh geez, part 3
Message-ID: <cd3.68b35b.32668100@aol.com>

carlo said:
>    That's exactly what mathematicians (at least, some) are doing. 

and i would expect that, as time goes by, the burgeoning drive
towards simplicity -- which is happening all over cyberspace --
will take root in that work too.   and honestly, until that happens,
there's no need for me to jump in prematurely with my pick on it.
i'll let y'all get things sorted out first, and then cherrypick the best.

since google will be scanning 10-40 million books, there are plenty
of non-math texts for me to work on before i do the math books...


>    OpenMath, MathML, Texmacs, Doyen are some of the names.   They 
>    are mostly either TeX based, or can import (carefully written) LaTeX.

this only makes sense, since tex is so well-established in that world.

it's hard for me as a non-mathematician to know where the line is
between tex as an equation editor and tex as a typesetting tool...

-bowerbird
From Bowerbird at aol.com  Tue Oct 17 15:09:43 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 17 15:09:54 2006
Subject: [gutvol-d] just for the record
Message-ID: <c69.29c7b92.3266aea7@aol.com>

sam said:
>    First, I haven't seen any suggestion here that PG or DP as a whole
>    should adopt a ZML-based workflow, so there's little point to agonizing
>    about whether someone working semi-independently is most comfortable
>    using ZML

that's basically right.

the situation is a little more complicated than that, because i'm saying
eventually there will indeed be a policy shift to zen markup language,
or something very similar to it, because it will be seen to be superior...

but based on a whole bunch of experience -- that flurry yesterday?,
it used to be like that here _every_day_, i kid you not, it was stupid --
i'm convinced the people here now won't be able to see it very soon.
and even once they can see it, their pride will make them deny it...

but their system will collapse in on itself, assuming it even gets built,
and the people who clean up the mess will rebuild on simplicity instead.

so my aim at present is to mold the p.g. library as my own z.m.l. library;
my ability to then maintain and extend it will serve as a model for others.


>    As long as the result is good-quality XHTML and text, do we care 
>    whether the post-processor or independent submitter used 
>    ZML, TEI, Dreamweaver, Word, groff, vi, ed, or a Ouija board?
>    The point is to get the books proofread and distributed.

just a couple of quick points about this...

first, you've begged the question by positing "good-quality xhtml"
as the objective.   _my_ actual objective is "highest-quality e-books."

further, the "point" is not just to get books proofread and distributed,
it's to create a library which (a) is maintained with a minimum of effort,
and (b) gives users maximal power to use the books in a variety of ways.

i don't think you'll disagree with either of those elaborations.
and if you do, i'm pretty sure we can just agree to disagree...


>    Second, one of the most important reasons to use "heavyweight" 
>    markup languages is that they make information about the meaning 
>    of a document, rather than just its surface structure, available to 
>    automatic search, browsing, and analysis tools.

i know that's one of the things that is _promised_ by heavy markup.
but i don't see it being delivered in a meaningful way, not at present.
and i think there will be serious problems with realizing the promise.

in fact, it is my opinion that "invisible" markup has a _better_ chance
of delivering on this promise, as it's better at "staying out of the way",
_plus_ it entails fewer up-front costs and thus is on a faster track to
a better cost-benefit ratio.

the only way that heavy markup can compete is if it can be applied
_programmatically_.   but then, of course, it would be possible to
bake the markup-application code into the end-analysis program,
and eliminate the markup middleman entirely.   (do you grok this?
because it's really the most important argument against markup.)


>   For example, extracting bibliographic citations from the presentational
>    information available in PDF/PostScript/HTML/text documents 
>    is a serious chore that the teams at CiteSeer, Google Scholar, IEEE, 
>    and numerous other organizations have spent many person-years on; 
>    the results are still terribly error-ridden, and the programs that 
>    produce them are full of shady heuristic guess-work.

i understand exactly what you're talking about.

one of the "hidden secrets" about the philosophy of "invisible markup"
is that we have to learn how to make that information _transparent_...

in other words, we need to _format_ our documents so that the info
that we might want to extract from them becomes _obvious_, both to
computers and to humans.   once you understand that this is the key,
you'll realize that it's really not that difficult.   indeed, it's not hard at all.

contrast this with the heavy-markup approach, where the modus operandi
is to _label_everything_.   that method will "work", yes it certainly _will_,
since ambiguity is reduced to a minimum if you've labeled everything,
but the labeling process is extremely costly.   if we consider all the text
that we have out in the world, the idea of labeling it all is preposterous.

even if it _was_ possible, it would quickly become absolutely unworkable,
since the labels would soon be tripping over themselves, and the actual
_content_ would quickly become completely obscured by all the labeling.

and yes, it's gonna take a little while before mankind realizes all of this.
and we'll be misled by mid-stream "successes" where markup "works"
before it collapses on itself.   but it won't take long for confusion to reign.

and -- perhaps more importantly -- since the _costs_ have to be paid
up-front, before any of the benefits even start to accrue, the probability
that people will be suckered into doing markup is _greatly_ diminished.

meanwhile, a movement toward _simplicity_ has already begun in earnest.

ordinary people don't want to be bothered with doing markup, so we have
begun inventing ways to have it applied automatically, in the form of z.m.l.
and markdown, textile, wiki-formatting, plus a myriad of others to follow...

people have also become allergic to "feature creep", so now the innovators
(like, say, 37signals) have made "simplicity" the key _feature_ in their apps...

and there's no turning back of this momentum.   who needs complexity?

the technoids who've been trying to shove complexity down our throats
-- whether because they personally prefer _difficulty_, or because they
see our _helplessness_ as the key to their economic future, or _both_ --
are going to fail.   i repeat: they are going to fail.   it's really that simple...


>    Extracting references (or virtually any other meaning) 
>    from natural body text, without the formalism of an
>    academic paper's References section, is a hard unsolved AI problem.

right.   and that problem will probably _never_ be solvable.

but did you notice that, _with_ the "formalism" (as you have put it) of
a "references" section, the problem reduces to an utterly easy task?

now, once you understand that what you call "formalism" is what i call
"transparency", then you'll have a good understanding of what i mean.

and material in that "references" section does _not_ have to be labeled
with heavy x.m.l. markup for it to be effectively and efficiently parsed...

a good example of this is "zotero", dan cohen's new firefox plug-in.
it "senses" when a web-page has cataloging-type information on it,
and will automatically capture that data and save it in your database,
so you can search it later, bundle it into your own papers, and so on.


>    However, if the authors of a document supply proper semantic markup,
>    such as TEI or BibTeX, getting and analyzing citations from the document
>    is trivial.

the focus of heavy-markup advocates is always on the _benefit_ of the
"trivial" retrieval, with the substantial _cost_ of the initial encoding being
waved off as if it happens by magic.   unfortunately, this makes for great
_snowjobs_by_salesmen_.   fortunately, once people have been suckered
by this "free lunch" promise several times, they _will_ wise up eventually.

once they find out that markup doesn't deliver what it had promised,
the scenario will be "you need more markup, and better markup, and
-- oh, by the way -- that more better markup will also cost you extra",
and the snake-oil salesman will be booted out the door.   end of story.


>    If reader software understands the citations and allusions
>    in a document, it can add clickable links to them, thus making 
>    following the references a natural and near-instant action.

right.   and this is what we expect from an electronic document.

but what we really want is to have those links made automatically,
instead of having to code them all manually.   that's too much work.


>    But links are just the easiest thing we can do with software 
>    that understands references. It's nearly as easy to add 
>    Xanadu-style bidirectional links, so that when reading 
>    a document we can see which other documents refer to it; 
>    by counting those references, we can guess the importance 
>    of a document. These links can be much more powerful 
>    than simple HTML-style links, because they can encode 
>    information about the type of reference: Is it a formal citation, 
>    an allusion, a quotation, a paraphrase, a plagiarization, ...? 
>    Is this article a review of that movie? Is the review positive or negative?

again, cool stuff to have.   but a tremendously expensive pain to code,
if done manually.   so find a way to apply the markup programmatically.

and then, once you've found the way to apply markup programmatically,
just put that code into the program that _your_system_ would eventually
have _interpreting_ that markup.   you'll find that you have saved yourself
enough time and energy by not having to write the markup-interpretation
routines that your effort in programmatic-understanding pays for itself...

and this is the "secret sauce" of my approach, sam, that we put intelligence
into our _programs_ instead of into our _markup_, because the intelligence
in _programs_ can be applied to new content without any additional work,
whereas with a markup approach that new content must be marked up first.
even if markup establishes a toehold, this ongoing expense will kill it off...


>    properly marked-up equations, tables, diagrams, and sheet music 
>    can be subjected to all sorts of processing   that make them 
>    even more valuable to society than they were in their original contexts.

the question as to whether they are "more valuable" needs to include
an accurate assessment of the cost of applying that "proper" markup.

there might well be more benefits after application of "proper markup",
but if the _costs_ of applying that markup outweigh the added benefits,
it is unwise to go down that path.   you might not _like_ that conclusion,
but from the entirely reasonable perspective of cost-benefit analysis,
sam, it's simply not a conclusion that you can avoid.   sorry about that...


>    Considering how time-intensive properly proofreading a book is, 
>    adding a bit of semantic markup at the post-proofing stage is
>    quick and easy.

but if we can get the same benefits _without_ that markup,
then there's simply no good reason to apply it, is there?

even if we get _almost_ the same benefits, if the application of
markup is expensive (as it will be), cost-benefit says don't-do.


>    Third, the visual appeal and simplicity of the source code of an XML
>    document is entirely beside the point, because hand-editing XML source
>    and hoping it validates is a hackish trick of last resort. Many people
>    do it even now because there are no great affordable XML editors, but
>    ultimately it's not the right way to produce a document.

at least you are honest about the stupidity of editing x.m.l. source,
and the fact that there are no great affordable x.m.l. editors.   kudos!

anyway, as i said at the top, the cost of _maintaining_ a cyberspace library
is one of the most important considerations that we need to keep in mind.

and it's just one more arena where z.m.l. will shine...


>    at a certain point most of them would grow beyond the capabilities
>    of the software, and knowing how to use it would give them no insight
>    whatsoever into producing more complex results with more sophisticated
>    software; they would need to relearn everything, starting from square one.

you've made the natural assumption that a "lightweight" markup will have
a capability-ceiling that is deficient when compared to a "heavyweight" one.

i can't say it with definite authority quite yet, but i highly suspect that this
"natural" assumption will prove to be greatly misleading in the case of z.m.l.

i can say with much confidence that for 95% of the project gutenberg e-texts,
there will be _no_ difference in performance capability between z.m.l. and t.e.i.

and the number might well jump to 97%, or even 99%, or even 99.5%.

again, equivalent benefits with much lower costs is a no-brainer decision...


>    in XML there are precisely three metacharacters to worry about

metacharacters are a pain.   so i've tried to think outside that box.
we'll see in the long term how well i have succeeded.


>    If you make a genuine attempt to add all of the semantic 
>    and presentational functionality that people need to a 
>    homemade markup language, at a certain point you will 
>    need to add increasingly long and complex expressions 
>    to deal with things that you didn't think of when you were 
>    designing the language, and while the source code will no
>    longer be any easier to read or write than XML, you will still 
>    lack the large amount of careful, well-reasoned work that's 
>    gone into adding those features to TEI already, and you'll 
>    have nothing like the power of XML namespaces.

that's a rather bleak if:then statement you've laid out, sam, so
let's see if i can short-circuit it at the start, so as not to have to
worry about it, ok?   do you see any features common to books
that i've left out of my test-suite?   if so, do please let me know...
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.html
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/test-suite.zml


>    The quest for "markup without markup" reminds me of 
>    Flannery O'Connor's "Holy Church of Christ Without Christ": 
>    ultimately not an honest or achievable aim.

whether it is "achievable" is an open question.
since i'm the one putting in the work, though,
i assume you don't mind how i spend my time.

as to whether this is an "honest" aim, well...
i suppose all the technoids who were hoping
to make a cushy living from x.m.l. consulting
will feel that i've tricked them "unfairly", but
i see myself as saving the human race from a
big bunch of needless and costly complexity.


>    A markup language is a markup language

well, yes, it is...


>    A markup language is a markup language,
>    regardless of whether its syntax uses tags 
>    between angle brackets or punctuation
>    characters with carefully chosen whitespace.

...but some markup languages are more beautiful
than others, and -- at least in my humble opinion --
the invisible ones are the most beautiful of all...         :+)

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 04:11:03 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 04:11:10 2006
Subject: [gutvol-d] since lee doesn't have any more questions
Message-ID: <c6f.2a71937.326765c7@aol.com>

ok, lee, so as i said the other day, i want you to waste your time
reinventing the wheel, by writing a program i've already written
(which you seem to be fond of insinuating is mere vapor)...

the thing is, i want you to hurry up and write it, so we can move on
to the next step after that.   it's 2006 already, and time is a-wasting.

to that end, i'm willing to advise you in the writing of your version.

we can make it open-source -- i'll direct you, and you'll program.

so each day, i'll give you a little assignment for a routine to write
-- i'll even give you the pseudo-code for it -- and then when you
come back with the routine finished, we'll go on to the next one...

if you need some additional assistance after the pseudo-code,
i'll provide you some source written in basic (easy to understand).
if you still need more help after that, i'll give you the code in perl.
of course, you'll always get the correct answer, to verify your work.
given direction, examples, and answers, i'm sure you will succeed.

you -- of course -- can write the program in any language you like.
(and you can make it a web-based program, offline application, or
both, whichever you prefer and are capable of, makes no difference,
as it is the discipline of writing the code that will be your main gift.)

and the lurkers out there can download it from wherever you put it,
and run it on their machines, to beta-test it, and give their feedback
to us on any bugs.   hey, it'll be a nice little group project, lots of fun...

within just a few weeks' time, we'll build from the very beginning up to
a solid program that acts capably, and we'll all be happy and amazed...
(time-wise, at the outside, even slow/steady progress will finish in 6 weeks.)

at the same time, if you like, we can also work from the end to the start,
by reverse-engineering some of the e-books i've already placed online.

***

for our content, we'll use a book with which you might be familiar...

>    http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml

so this is your input file, the .zml "master version" that generates others.

***

here, let me repeat that url for you, because it's important...

>    http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml

whether you input the text into your program from the web each time,
or from a saved-file version on the local machine, is (again) up to you.

and yes, this _is_ "my antonia", digitized by jose menendez, jon noring,
and a flock of others.   under my direction, you'll write a program that
will turn this nifty .zml version of the book into some solid .html files.
then we can convert that .html to a wide variety of our e-book formats.
i will also show you how to write routines to get a nice .pdf of the text.

all from a measly "lightweight" file in z.m.l. -- "zero markup language",
the "virtually invisible" markup that's 2 steps more advanced than x.m.l.

and before you know it, you'll have the wondrous program that you seek.

if you need to support any more formats after that, we can talk about it.
but for the most part, if i can't convert to it from .html, i ain't interested.

***

it'll be a very simple assignment you get every day, so it won't take much
work to get it done.   and you'll see regular progress, so it'll be rewarding.


and if anyone else wants to jump in, please feel free!   it's a group project!
a nice group hug, which we all need after the recent gutvol-d antipathy...

and hey, wouldn't it be _neat_ to have this program being simultaneously
developed in multiple languages, such as perl and python, php and ruby!
let's have some flash, experiment with ajax, play with abuncha cool hacks.

87 different tools ordinary people can use to convert our electronic-books.
just try and tell me that wouldn't be cool; go ahead fool, try to persuade me.

and quick, someone tell david rothman we'll solve his tower of babel with
universal translation tools that let every book be understood by everyone.

heck, to honor douglas adams, we might even call it "babelfish"...

"we don't need no education... teachers leave those kids alone..."

***

so here we go!

today's assignment:
1.   write the shell of a program that reads the text-file and then
writes it out as a simple .html web-page, formatted with [pre].

(note: in these assignments, i will use square brackets instead of 
the traditional .html angle-brackets, to eliminate any confusion.)

that's right, just a straight read and then a straight write.   easy.
nothin' fancy, just "preformatted".   read input and write output.

pseudo-code:
1a.   read file.
1b.   prepend .html header-info.
1c.   display as web-page.
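
as a rough illustration only, steps 1a through 1c might look like this
in python.   the function names, the utf-8 encoding, and the choice to
write the page to a file (rather than serve it) are assumptions made for
this sketch, not part of the assignment.   real angle-brackets are used
here, since the output has to be actual .html.

```python
import html
from pathlib import Path

# Standard doctype for HTML 4.01 Transitional.
DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)


def text_to_pre_html(text: str, title: str = "untitled") -> str:
    """Wrap plain text in a minimal web-page, formatted with <pre>."""
    # Escape &, < and > so the book's text cannot be read as tags.
    body = html.escape(text, quote=False)
    return (
        f"{DOCTYPE}\n"
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"<pre>\n{body}</pre>\n"
        "</body>\n</html>\n"
    )


def convert_file(src: str, dst: str) -> None:
    """Hypothetical helper: read src, write the wrapped page to dst."""
    # 1a. read file (utf-8 is an assumption about the input).
    text = Path(src).read_text(encoding="utf-8")
    # 1b. prepend the header-info; 1c. write the page so a browser can show it.
    page = text_to_pre_html(text, title=Path(src).stem)
    Path(dst).write_text(page, encoding="utf-8")
```

calling convert_file("myant.zml", "myant.html") on a saved copy of the
input would then produce one page holding the whole text, untouched
apart from the escaping.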

***

extra-credit assignment:
1x.   lines with double-curly-braces give the scan-filename and
the running head for each page; since some readers will want to
eliminate these to rejoin the text's normal flow, write a routine
tagging each line (a) followed by a line containing curly-braces,
(b) or _is_ a line that contains curly-braces, (c) or is preceded by
a line containing curly-braces.   in other words, you will eliminate
each line with curly-braces, _and_ each line above it and below it.

all the untagged remaining lines -- the book in its normal flow -- 
will be written to a new file, for the user's uninterrupted enjoyment.

pseudo-code:
1xa.   read the lines of the file into an array.
1xb.   step through array marking appropriate lines for deletion.
1xc.   write all unmarked lines into a new file.
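
a minimal sketch of steps 1xa through 1xc, again in python.   it assumes
that a page-marker line can be recognized by the presence of "{{" and
that the input is utf-8; both are guesses about the .zml file, and the
function names are made up for the example.

```python
from pathlib import Path

MARKER = "{{"  # assumption: double-curly-braces open every page-marker line


def strip_page_markers(lines: list[str]) -> list[str]:
    """Drop each marker line plus the line just above and just below it."""
    marked = [False] * len(lines)
    # 1xb. step through the array, marking lines for deletion.
    for i, line in enumerate(lines):
        if MARKER in line:
            for j in (i - 1, i, i + 1):
                if 0 <= j < len(lines):
                    marked[j] = True
    # Keep only the unmarked lines: the book in its normal flow.
    return [line for line, drop in zip(lines, marked) if not drop]


def strip_file(src: str, dst: str) -> None:
    """Hypothetical helper tying the three steps together."""
    # 1xa. read the lines of the file into an array.
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    # 1xc. write all unmarked lines into a new file.
    kept = strip_page_markers(lines)
    Path(dst).write_text("\n".join(kept) + "\n", encoding="utf-8")
```

marking first and filtering afterwards keeps the neighbor test simple:
two markers close together cannot confuse the indexes, because nothing
is deleted until the whole array has been scanned.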

like i said, all of the routines will be simple.   they won't all be
as simple as _these_ routines -- which are _very_ elementary,
we're jus' startin' slow, don't be offended they are so easy --
but we'll still have lots of fun, yes we will...           :+)

-bowerbird

p.s.   if you want to pursue the reverse-engineering angle,
the .html files that were the end-result of my program are
available at the base url which you can determine from the cover:
>    http://www.greatamericannovel.com/myant/myantc001.html

and proceeding through the forward-matter:
>    http://www.greatamericannovel.com/myant/myantf001.html

and then the pages themselves:
>    http://www.greatamericannovel.com/myant/myantp001.html

page 123, for instance, can be seen at:
>    http://www.greatamericannovel.com/myant/myantp123.html

if you can write a program _in_one_week_ that outputs the series
of .html files represented, you will receive high honors in this class,
and an automatic promotion to the advanced version of this school.
From Bowerbird at aol.com  Wed Oct 18 04:30:40 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 04:30:55 2006
Subject: [gutvol-d] re: since lee doesn't have any more questions
Message-ID: <cdb.77c7de.32676a60@aol.com>

i said:
>    "we don't need no education... teachers leave those kids alone..."

yes, of course i know it's "them kids", not "those kids", it's just that
i was playing with google to do a comparison count of the forms -- 
>    the correct "leave them kids alone" = 18,600,
>   the incorrect "leave those kids alone" = 11,100
>   the incorrect "leave the kids alone" = 10,600
>   the unique "leave us kids alone" = 196
>   the rarest "leave these kids alone" = 9
("yay", ask-the-audience gives right answer again, wisdom of crowds)
-- and somehow at 4:20 in the morning i posted the wrong version.

so sue me...        ;+)

-bowerbird

p.s.   speaking of which, google considers youtube (and itself) to be
totally immune to any copyright concerns, under the "safe harbor"
shield, because they immediately take down once they get a notice,
which is all you have to do to rescue yourself from a suit under dmca.
kinda funny how easy it is to make yourself immune, isn't it?, especially
since it doesn't really matter _how_ dirty you were right up to that point.
youtube, for instance, is dirty as can be.   and yet washed free immediately.
very very interesting...


"hey teacher leave" = 41,400
From marcello at perathoner.de  Wed Oct 18 05:02:22 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 18 05:02:26 2006
Subject: [gutvol-d] just for the record
In-Reply-To: <c69.29c7b92.3266aea7@aol.com>
References: <c69.29c7b92.3266aea7@aol.com>
Message-ID: <453617CE.9000605@perathoner.de>

Bowerbird@aol.com wrote:

> _my_ actual objective is "highest-quality e-books."

And after 3 1/2 years of development your _status_ is:

  "This page is not Valid HTML 4.01 Transitional!"


http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001.html


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Wed Oct 18 10:10:51 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 10:10:59 2006
Subject: [gutvol-d] just for the record
Message-ID: <304.1059097e.3267ba1b@aol.com>

marcello said:
>    And after 3 1/2 years of development your _status_ is:
>    "This page is not Valid HTML 4.01 Transitional!"

oh no!   i've not been validated!   my life is so shallow!       :+)

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 10:34:47 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 10:34:58 2006
Subject: [gutvol-d] re: my shallow life
Message-ID: <235.10ff3d00.3267bfb7@aol.com>

distraught, i sought out the source of my invalidity:
>    Error Line 28 column 16: 
>    an attribute value must be a literal unless it contains only name characters.
>    <font color=rgb(222,99,99)>[[1]]</font><br>

oh my word.   how could i ever right this terrible wrong?
existence is such a sad tragedy, is it not?   alas, i despair.

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 11:16:25 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 11:16:32 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <bf6.76d7d8f.3267c979@aol.com>

rescued from the desert of invalidity and refreshed to face another day,
i remind myself of the old man in the monty python movie who insists
"i'm not dead yet", and who is then, of course, smashed over the head...

http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001.html

-bowerbird
From hyphen at hyphenologist.co.uk  Wed Oct 18 11:21:08 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Oct 18 11:21:25 2006
Subject: [gutvol-d] re: my shallow life
In-Reply-To: <235.10ff3d00.3267bfb7@aol.com>
References: <235.10ff3d00.3267bfb7@aol.com>
Message-ID: <umrcj2hfmrm4pct1gpepi9r4bk2jqb9eg7@4ax.com>

On Wed, 18 Oct 2006 13:34:47 EDT,  Bowerbird@aol.com wrote:

|distraught, i sought out the source of my invalidity:
|>    Error Line 28 column 16: 
|>    an attribute value must be a literal unless it contains only name characters.
|>    <font color=rgb(222,99,99)>[[1]]</font><br>
|
|oh my word.   how could i ever right this terrible wrong?
|existence is such a sad tragedy, is it not?   alas, i despair.


After chasing down an HTML problem intermittently for a few days, I hate the
validation suite as well. It does not give a definitive explanation of
what was wrong, as one would expect of modern software; it just gives everlasting
complaints about things downstream of the real problem.
*************************
***HATE HATE HATE HATE***
*************************

*******************
***CRAP SOFTWARE***
*******************
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From desrod at gnu-designs.com  Wed Oct 18 11:27:24 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Wed Oct 18 11:32:10 2006
Subject: [gutvol-d] re: my shallow life
In-Reply-To: <umrcj2hfmrm4pct1gpepi9r4bk2jqb9eg7@4ax.com>
References: <235.10ff3d00.3267bfb7@aol.com>
	<umrcj2hfmrm4pct1gpepi9r4bk2jqb9eg7@4ax.com>
Message-ID: <Pine.LNX.4.64.0610181425540.8243@aphrodite.gnu-designs.com>


> |>    <font color=rgb(222,99,99)>[[1]]</font><br>

> |oh my word.  how could i ever right this terrible wrong? 
> |existence is such a sad tragedy, is it not?  alas, i
> |despair.

Font tags are being deprecated, unnecessary, evil, wastes bandwidth 
and should be left to gather dust in the corner..


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From marcello at perathoner.de  Wed Oct 18 11:50:26 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 18 11:50:30 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <bf6.76d7d8f.3267c979@aol.com>
References: <bf6.76d7d8f.3267c979@aol.com>
Message-ID: <45367772.3090206@perathoner.de>

Bowerbird@aol.com wrote:

> rescued from the desert of invalidity and refreshed to face another day,
> i remind myself of the old man in the monty python movie who insists
> "i'm not dead yet", and who is then, of course, smashed over the head...
> 
> http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001.html


http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html

  "This page is not Valid HTML 4.01 Transitional!"


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Wed Oct 18 12:06:21 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:06:31 2006
Subject: [gutvol-d] re: my shallow life
Message-ID: <bee.6a9d7c3.3267d52d@aol.com>

david said:
>    Font tags are being deprecated, unnecessary, 
>    evil, wastes bandwidth and should be 
>    left to gather dust in the corner..

it was bad enough when my existence was meaningless,
but now my work has been characterized as "evil".   lordee!

how can i recover from this severe blow to my self-esteem?         :+)

-bowerbird

p.s.   i think you'll find some c.s.s. in those .html templates,
and yes, since that particular bit of colorization was for the
pagenumber, i agree that it should be tagged more clearly.

p.p.s.   dave, the best way to think about the validator is as
an idiot-savant, who knows when things are "wrong", but
gets tongue-tied and can't really tell you _why_.   if you think
of the poor thing in this way, you'll be able to forgive its autism.
From desrod at gnu-designs.com  Wed Oct 18 12:12:02 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Wed Oct 18 12:11:15 2006
Subject: [gutvol-d] re: my shallow life
In-Reply-To: <bee.6a9d7c3.3267d52d@aol.com>
References: <bee.6a9d7c3.3267d52d@aol.com>
Message-ID: <Pine.LNX.4.64.0610181510240.9552@aphrodite.gnu-designs.com>


> it was bad enough when my existence was meaningless, but now my work 
> has been characterized as "evil".  lordee!

 	Not your work, your implementation.

> how can i recover from this severe blow to my self-esteem?  :+)

 	Ignore it and move on to bigger and better things.

> p.p.s.  dave, the best way to think about the validator is as an 
> idiot-savant, who knows when things are "wrong", but gets 
> tongue-tied and can't really tell you _why_.  if you think of the 
> poor thing in this way, you'll be able to forgive its autism.

 	Not sure who "Dave" is, but I'll respond: The validator told 
you exactly "why" it was wrong; now it's up to you to figure out how to 
fix it. The validator isn't mystical or confusing at all, if you read 
the errors and warnings it reports.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From Bowerbird at aol.com  Wed Oct 18 12:12:08 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:12:13 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <cbc.10e19d7.3267d688@aol.com>

marcello said:
>    http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html

there are 428 pages in this book.

are you going to point to this same "error" on every one?

because, you know, that might actually be kind of _fun_...

so -- in the spirit of distributed proofreaders and "a page a day" --
i'll correct page 2 tomorrow, whereupon you can point to page 3,
and then the next day i'll correct page 3, so you can point to page 4,
and the day after that i'll fix page 4 and you will then point to page 5,
and along about the start of 2008, we'll be all finished with this book!
then we can start on another one.   come on, marcello, it'll be lots of fun!

-bowerbird
From desrod at gnu-designs.com  Wed Oct 18 12:17:36 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Wed Oct 18 12:17:18 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <45367772.3090206@perathoner.de>
References: <bf6.76d7d8f.3267c979@aol.com> <45367772.3090206@perathoner.de>
Message-ID: <Pine.LNX.4.64.0610181513430.9552@aphrodite.gnu-designs.com>


> http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html

>  "This page is not Valid HTML 4.01 Transitional!"

The error is obvious. If you must use these broken tags, use them as 
follows:

 	<font color="rgb(222,99,99)">[[2]]</font><br />


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From desrod at gnu-designs.com  Wed Oct 18 12:22:04 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Wed Oct 18 12:21:19 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <cbc.10e19d7.3267d688@aol.com>
References: <cbc.10e19d7.3267d688@aol.com>
Message-ID: <Pine.LNX.4.64.0610181517530.9552@aphrodite.gnu-designs.com>


> there are 428 pages in this book. are you going to point to this 
> same "error" on every one?

> so -- in the spirit of distributed proofreaders and "a page a day" 
> -- i'll correct page 2 tomorrow, whereupon you can point to page 3,

 	Here, let me fix page 2 through 428 for you in one shot:

    perl -pi.$$ -e \
      's|color=rgb\(222,99,99\)|color="rgb\(222,99,99\)"|g' \
      `find . -type f -name '*html'`

 	(all on one line, of course)
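
 	For what it's worth, quoting the value is enough to get past the
validator, but as far as I know HTML 4.01 only defines #rrggbb hex
values and a handful of colour names for the color attribute; rgb(...)
is CSS notation, so a browser may still not show the intended colour.
Below is a minimal Python sketch of the same batch fix that writes the
hex form instead. The names fix_markup and fix_tree are made up for
illustration, and it assumes the pages are *.html files somewhere under
one directory:

```python
import re
from pathlib import Path

# Matches the unquoted attribute seen in the thread, e.g. color=rgb(222,99,99).
# Working on bytes leaves the rest of each file untouched, whatever its encoding.
RGB_ATTR = re.compile(rb'color=rgb\((\d{1,3}),\s*(\d{1,3}),\s*(\d{1,3})\)')


def to_hex_attr(match):
    """Turn one color=rgb(r,g,b) match into a quoted #rrggbb attribute."""
    r, g, b = (min(int(v), 255) for v in match.groups())
    return b'color="#%02x%02x%02x"' % (r, g, b)


def fix_markup(data):
    """Return the page bytes with every unquoted rgb() colour rewritten."""
    return RGB_ATTR.sub(to_hex_attr, data)


def fix_tree(root="."):
    """Rewrite *.html files under root in place; return the paths changed."""
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        original = path.read_bytes()
        fixed = fix_markup(original)
        if fixed != original:
            # Keep a backup next to the page, as perl -pi.$$ does.
            path.with_name(path.name + ".bak").write_bytes(original)
            path.write_bytes(fixed)
            changed.append(path)
    return changed
```

 	Calling fix_tree(".") from the book's directory would rewrite
every affected page and leave a .bak copy of each original beside it.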

 	Now let's get back to doing real work, instead of bickering 
about doing it.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From Bowerbird at aol.com  Wed Oct 18 12:22:41 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:22:50 2006
Subject: [gutvol-d] re: my shallow life
Message-ID: <cc6.81444c.3267d901@aol.com>

david said:
>    Not sure who "Dave" is, but I'll respond:

evidently, you didn't read the post from dave fawthrop;
he was the one complaining about the validator output.

i've been stumped at times in the past, but i found this
quite easy to understand, since i had used the realbasic
method of specifying color -- rgb(rrr,ggg,bbb) -- and
not the html method.   as i'm sure you know quite well,
that kind of transference error happens a lot when you
jump from one environment into another.   fact of life...

anyway, david, your taking all of this seriously is putting
a real crimp on the humor of my mockery...             ;+)

and bottom line, all of these .html files will eventually be
spit out on demand by a script, which can be changed
on a whim, so any particular manifestation of the .html
is so transitory as to be unworthy of examination.   really.

just so's you know...

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 12:25:03 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:25:18 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c71.2b0fefc.3267d98f@aol.com>

david said:
>    The error is obvious.

um, gee, thanks david, thanks a lot, i really appreciate it.

-bowerbird

p.s.   [heavy sigh]          ;+)
From Bowerbird at aol.com  Wed Oct 18 12:26:48 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:26:53 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <cda.815bdd.3267d9f8@aol.com>

david said:
>    Here, let me fix page 2 through 428 for you in one shot:

you know, somebody always has to ruin the fun...

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 12:40:59 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 12:41:09 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
Message-ID: <c84.535212f.3267dd4b@aol.com>

meanwhile, i am really excited about this "babelfish" project!

of course, lee hasn't responded yet, but i say, i just can't wait!

this is gonna be _so_ much fun!   it's difficult to _contain_ myself!

so even though i'm not _supposed_ to give a hint, via basic code,
until the day _after_ an assignment, i simply _must_ jump the gun.

so here's realbasic code for an app that loads in the text-file from
your hard-drive and displays the text proudly in a scrolling editfield.

>    dim f as folderitem
>    f=getfolderitem("myant.zml")
>    if f<>nil and f.exists then f.openstylededitfield editfield1

that's it, folks!   that's all it takes to load a text-file and display it!
so we've accomplished the first day's lesson!   pretty simple, eh?

see, i told you this was gonna be fun!   i feel validated already!

in fact, that was so easy, let's do the extra-credit one too, ok?
   
>    dim tl(-1) as string rem -- tl (for "the lines") is an array
>    dim x as integer
>    tl=split(editfield1.text,chr(13)) rem -- read text into array
>    tl.insert(0,"") rem -- do a little housekeeping for clarity
>    tl.append("") rem -- a little more housekeeping for clarity
>    for x=1 to ubound(tl)-1
>      if instrb(1,tl(x+1),"{{")>0 then tl.remove(x) rem -- delete preceding
>    next x
>    for x=2 to ubound(tl)
>      if instrb(1,tl(x-1),"{{")>0 then tl.remove(x) rem -- delete following
>    next x
>    for x=1 to ubound(tl)
>      if instrb(1,tl(x),"{{")>0 then tl.remove(x) rem -- delete curly-braces
>    next x
>    editfield1.text=join(tl,chr(13)) rem put the new data back in field
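
For comparison, here is a minimal Python sketch of the same clean-up.
It assumes the intent is to drop every line containing "{{" together
with the line just before it and the line just after it, which is only
a reading of the comments in the code above; the function names are
made up for illustration. Marking the doomed lines first and filtering
afterwards avoids removing items from a list while still looping over
it, which is an easy way to skip a line by accident.

```python
def strip_marked_lines(lines, marker="{{"):
    """Drop each line containing marker, plus its immediate neighbours."""
    doomed = set()
    for i, line in enumerate(lines):
        if marker in line:
            # Mark the preceding line, the marker line, and the following line.
            doomed.update((i - 1, i, i + 1))
    return [line for i, line in enumerate(lines) if i not in doomed]


def strip_marked_text(text, sep="\r", marker="{{"):
    """Same clean-up on one string; sep defaults to chr(13) as in the original."""
    return sep.join(strip_marked_lines(text.split(sep), marker))
```

With a blank line on either side of each "{{...}}" marker, a call such
as strip_marked_text(editfield_text) leaves only the surrounding text.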

wow, that was easy too.

and we're all done for the day.   how about a cold beer?

but wait a minute here.   the output shows us the assignment was
flawed in a certain respect, and will need to be respecified slightly.

that's ok, that's one of the good things that code does, it forces you
to _be_specific_and_precise_ with specifications -- to be _exact_ --
and that's a very good way to be, an _excellent_ way to be, yes sir...

yes indeed, a half-hour of coding can help you hone your thinking
better than three-and-a-half months of listserve posts.   seriously!

so, can you tell me what the flaw was in the original specification?

-bowerbird
From marcello at perathoner.de  Wed Oct 18 15:02:58 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 18 15:03:03 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <cbc.10e19d7.3267d688@aol.com>
References: <cbc.10e19d7.3267d688@aol.com>
Message-ID: <4536A492.80200@perathoner.de>

Bowerbird@aol.com wrote:

> marcello said:
>> http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html
> 
> there are 428 pages in this book.
> 
> are you going to point to this same "error" on every one?

Are you going to fix the error on page 1 and hope the other 427 pages
fix themselves?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Wed Oct 18 15:26:46 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Oct 18 15:26:51 2006
Subject: [gutvol-d] re: my shallow life
In-Reply-To: <cc6.81444c.3267d901@aol.com>
References: <cc6.81444c.3267d901@aol.com>
Message-ID: <4536AA26.301@perathoner.de>

Bowerbird@aol.com wrote:

> and bottom line, all of these .html files will eventually be
> spit out on demand by a script, which can be changed
> on a whim, so any particular manifestation of the .html
> is so transitory as to be unworthy of examination.   really.


The funny thing here is that while you are giving us an endless song and
dance about your "highest quality ebooks"

> _my_ actual objective is "highest-quality e-books."

it is quite evident that you have implemented no QA processes whatsoever
or you would have caught so simple an error before posting 428 files
containing that error.

Bowerbird "uality" ebooks. I'm sure!


And don't get me started about your complete lack of WAI accessibility
features while your user interface is so ugly that only a blind person
might actually want to use it.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Wed Oct 18 18:15:48 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 18:15:55 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c3a.592837e.32682bc4@aol.com>

marcello said:
>   Are you going to fix the error on page 1 
>    and hope the other 427 pages fix themselves?

no!   _i_ am gonna fix each page, by hand, one per day,
assuming that i receive an error-report from you on it...

tomorrow i fix page 2.   oh yes, this will be so much fun!


>    it is quite evident that you have implemented no QA processes whatsoever
>    or you would have caught so simple an error before posting 428 files
>    containing that error.

"that error" -- as you put it -- means two things, and two things only.
1.   the pagenumber is black instead of some other color.
2.   the .html doesn't "validate".

neither of those things is of the _slightest_ concern to me.   sorry.

indeed, in fact, if memory serves me correctly, i left "that error" in
on purpose, precisely because it caused the page not to validate,
and i knew that would be upsetting to you technoid wieners here.

and yes, i realize that it's a smallish cruelty of sorts to poke the o.c.d.
tendencies of some people here for something so fully insignificant,
and i hope people forgive me, as i feel badly about that, i _really_ do...
(no i don't.   yes i do.   no i don't.   yes i do.   no i don't.   yes i do.  
 no i don't.)


>    And don't get me started about 
>    your complete lack of WAI accessibility features
>    while your user interface is so ugly that 
>    only a blind person might actually want to use it.

nah, the blind people will prefer the plain-text .zml version,
since there isn't any crap in there to gunk up screen-readers.

-bowerbird
From Bowerbird at aol.com  Wed Oct 18 18:22:01 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Oct 18 18:22:11 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <556.9857302.32682d39@aol.com>

dang, i know i'm supposed to wait, but i'm just so excited, i can't stop 
myself!

> http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp002.html

page 2 validates!   woo-hoo!

your move, marcello!

-bowerbird
From lee at novomail.net  Wed Oct 18 20:15:15 2006
From: lee at novomail.net (Lee Passey)
Date: Wed Oct 18 21:03:25 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <c84.535212f.3267dd4b@aol.com>
References: <c84.535212f.3267dd4b@aol.com>
Message-ID: <4536EDC3.5060403@novomail.net>

Bowerbird@aol.com wrote:
>  meanwhile, i am really excited about this "babelfish" project!
>
>  of course, lee hasn't responded yet, but i say, i just can't wait!

Heavens, please don't wait for me! I'll admit that Mr. Noring's 
suggestion of creating an authoring tool that is word processor-like and 
WYSIWYGy, that would permit author's to create e-books in a 
user-friendly environment, that would save documents in a powerful and 
useful format, and would convert to multiple existing e-book formats at 
the click of a button, is a very intriguing programming project. On the 
other hand, learning your approach to converting ZML to HTML is of 
virtually no interest to me at all.

If, however, you were to create an instance of the aforementioned 
authoring tool (which is what I thought you were claiming) and it only 
saved its work in ZML, then maybe I /would/ take the hour or two it 
would require to create a program to convert a ZML file into something 
useful. Or if you were to clean up several thousand of the PG files 
which are not currently available in HTML and make them available in 
ZML. But currently there is no compelling reason to pay any attention 
whatsoever to ZML.

From hyphen at hyphenologist.co.uk  Wed Oct 18 23:39:28 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Oct 18 23:39:43 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <c84.535212f.3267dd4b@aol.com>
References: <c84.535212f.3267dd4b@aol.com>
Message-ID: <m57ej2hruc3r9b2fjnu4ago4vq89cg7s2i@4ax.com>

On Wed, 18 Oct 2006 15:40:59 EDT,  Bowerbird@aol.com wrote:

|meanwhile, i am really excited about this "babelfish" project!
|

babelfish has been going for years, and I have used it for years.
The translations given are absolutely awful.  At best they are a first pass
for a human translator, or give you  a general idea  of what the text is
about.
-- 
Dave Fawthrop <hyphen@hyphenologist.co.uk> 

From Bowerbird at aol.com  Thu Oct 19 00:35:59 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 00:36:08 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
Message-ID: <c10.7616f00.326884df@aol.com>

lee said:
>    I'll admit that Mr. Noring's suggestion of creating an authoring tool 
>    that is word processor-like and WYSIWYGy, that would permit author's

i take it you mean "authors" -- the plural -- not the possessive...

>    I'll admit that Mr. Noring's suggestion of creating an authoring tool 
>    that is word processor-like and WYSIWYGy, that would permit author's
>    to create e-books in a user-friendly environment, that would 
>    save documents in a powerful and useful format, and would 
>    convert to multiple existing e-book formats at the click of a button, 
>    is a very intriguing programming project. 

that's what we're doing here, lee!

the first part -- a wordprocessor-like, wysiwyg-y authoring-tool is
simple enough when the file-format is something as simple as .zml.

further, z.m.l. is that "powerful and useful format" that you mention,
precisely because it will do what you say you want to do, which is to
create files across a variety of e-book formats.

because once we morph the .zml file to .pdf and especially (x)html,
it's a simple matter of routing that .html to the converter programs.
(any e-book format that can't accept .html as input is a non-starter.)


>    On the other hand, learning your approach to converting ZML 
>    to HTML is of virtually no interest to me at all.

gee, lee, you're talking out of both sides of your mouth here.

first you say "it's a very intriguing programming project",
and then you say it is "of virtually no interest to me at all"...

which is it lee?

-bowerbird
From schultzk at uni-trier.de  Thu Oct 19 02:07:47 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Thu Oct 19 02:07:53 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <c10.7616f00.326884df@aol.com>
References: <c10.7616f00.326884df@aol.com>
Message-ID: <A2D00D2F-F826-48C2-A1CF-37BE89A99CB2@uni-trier.de>

Hi There,

	Such tools are already available commercially!

	TeX and LaTeX have been around for a long time.
	Textures is a WYSIWYG system and authoring tool.

	LaTeX can be easily converted to pdf, html, xml, docbook, etc.

	As Bowerbird mentioned in another thread, why reinvent the wheel
	or try to?

	Just my two Euro cents worth!

		regards
			Keith.

P.S. LaTeX is mark-up. It has chapters, paragraphs, formulas, graphics,
footnotes, layout control, indices, bibliographies, multi-language
support, right-to-left and left-to-right text, ASCII and Unicode, and
is practically platform independent.
       There are freeware versions, but they are generally not
WYSIWYG, so they have a steep learning curve at first.

		Keith.


From marcello at perathoner.de  Thu Oct 19 04:45:08 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 04:45:12 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <556.9857302.32682d39@aol.com>
References: <556.9857302.32682d39@aol.com>
Message-ID: <45376544.5000808@perathoner.de>

Bowerbird@aol.com wrote:

> page 2 validates!   woo-hoo!

Wow! only 426 pages to go on the way to the first Bowerbird "uality" ebook.


Then you can start fixing the 428 side-by-side view pages:


http://validator.w3.org/check?uri=http://www.greatamericannovel.com/myant/myantp001w.html

(I don't know why there are 428 of them, I would have thought 214 were
enough. You must know something I don't.)


Then you can upgrade from HTML 4.01 transitional to XHTML 1.0 strict
(because OEB standard is based on XHTML).


Then you can restructure your XHTML so it does not just "validate" but
makes sense semantically. At the same time you can build your CSS
stylesheets.


Then you can add WAI accessibility features so screen readers actually
know what they are reading.


Then you'll be about half the way PGTEI is now.


I guess, hoping that you'll stop bothering this list until you have
fixed your processes, is out of the question?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From davedoty at hotmail.com  Thu Oct 19 04:42:56 2006
From: davedoty at hotmail.com (Dave Doty)
Date: Thu Oct 19 04:54:58 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
Message-ID: <BAY105-W7272651F636C4D5B83BA8DF0C0@phx.gbl>


Okay, how did Bowerbird get around my killfile?
From marcello at perathoner.de  Thu Oct 19 04:57:19 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 04:57:25 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c3a.592837e.32682bc4@aol.com>
References: <c3a.592837e.32682bc4@aol.com>
Message-ID: <4537681F.8050801@perathoner.de>

Bowerbird@aol.com wrote:

> indeed, in fact, if memory serves me correctly, i left "that error" in
> on purpose, precisely because it caused the page not to validate,
> and i knew that would be upsetting to you technoid wieners here.

Poor, poor, Bowerbird. We have misunderestimated you.

It is uncanny how predictable you are. Every time somebody scores over
you by attacking your "evidence" frontally, you go nonlinear for hours.


> nah, the blind people will prefer the plain-text .zml version,
> since there isn't any crap in there to gunk up screen-readers.

And nothing to help screen readers ...


-- 
Marcello Perathoner
webmaster@gutenberg.org

From grythumn at gmail.com  Thu Oct 19 05:07:03 2006
From: grythumn at gmail.com (Robert Cicconetti)
Date: Thu Oct 19 05:35:18 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <BAY105-W7272651F636C4D5B83BA8DF0C0@phx.gbl>
References: <BAY105-W7272651F636C4D5B83BA8DF0C0@phx.gbl>
Message-ID: <15cfa2a50610190507u5776495fke06fc5a4c1d77ed3@mail.gmail.com>

People keep replying to him. I've been tempted to make my filter match on
body text too, but there have been a few occasions where the replies were
interesting.

R C

On 10/19/06, Dave Doty <davedoty@hotmail.com> wrote:
>
>
> Okay, how did Bowerbird get around my killfile?
From desrod at gnu-designs.com  Thu Oct 19 07:36:22 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 07:35:13 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <45376544.5000808@perathoner.de>
References: <556.9857302.32682d39@aol.com> <45376544.5000808@perathoner.de>
Message-ID: <Pine.LNX.4.64.0610191029000.16488@aphrodite.gnu-designs.com>


> Then you can upgrade from HTML 4.01 transitional to XHTML 1.0 
> strict (because OEB standard is based on XHTML).

 	Careful there... For the Windows users, MSIE doesn't support 
proper XHTML _at all_. It only copes with XHTML that is sent and 
parsed as plain HTML, which is what 99% of the people who think 
they're using XHTML properly are actually serving.

 	XHTML is properly served as "application/xhtml+xml", but most 
people end up sending it as "text/html", which doesn't really improve 
the situation.
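
 	One common workaround is content negotiation: send
"application/xhtml+xml" only to clients whose Accept header lists it,
and fall back to "text/html" for everything else. A minimal sketch in
Python, assuming a deliberately simple policy (wildcards such as */* do
not count as XHTML support); the function names are hypothetical and
not taken from any particular web framework:

```python
def parse_accept(header):
    """Parse an HTTP Accept header into a {media_type: q_value} dict."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the client gives no q parameter
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def pick_content_type(accept_header):
    """Return the MIME type to serve an XHTML page with for this client."""
    prefs = parse_accept(accept_header or "")
    if prefs.get("application/xhtml+xml", 0.0) > 0.0:
        return "application/xhtml+xml"
    return "text/html"
```

 	A server doing this would normally also send "Vary: Accept" so
that caches keep the two variants apart.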


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From desrod at gnu-designs.com  Thu Oct 19 07:38:31 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 07:37:13 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <m57ej2hruc3r9b2fjnu4ago4vq89cg7s2i@4ax.com>
References: <c84.535212f.3267dd4b@aol.com>
	<m57ej2hruc3r9b2fjnu4ago4vq89cg7s2i@4ax.com>
Message-ID: <Pine.LNX.4.64.0610191037470.16488@aphrodite.gnu-designs.com>


> babelfish has been going for years, and I have used it for years. 
> The translations given are absolutely awful.  At best they are a 
> first pass for a human translator, or give you a general idea of 
> what the text is about.

 	translate.google.com is fairly accurate with many of the 
unpopular languages, and I've had good success with it over the last 
couple of years.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From Bowerbird at aol.com  Thu Oct 19 10:10:15 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 10:10:26 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <490.ba2c344.32690b77@aol.com>

marcello said:
>    Then you can start fixing the 428 side-by-side view pages:

boo-ya!   the fun will run into 2009!


>    (I don't know why there are 428 of them, 
>    I would have thought 214 were enough. 
>    You must know something I don't.)

i know that end-users want _simplicity_ in referencing them,
which means they might want to reference each pagespread
using its left-page reference _or_ its right-page reference...


>    Then you can upgrade from HTML 4.01 transitional 
>    to XHTML 1.0 strict (because OEB standard is based on XHTML).

as soon as jon noring joins our little open-source project here,
i'm sure he'll be happy as a pig in shit to tell us how to do that...

speaking of which, i'm sure jon is _so_ proud of me now that i'm
engaging in an open-source project.   just what he always wanted!


>    Then you can restructure your XHTML so it does not just "validate" 
>    but makes sense semantically. 

again, jon noring will be our go-to guy on that.


>    At the same time you can build your CSS stylesheets.

ditto.   if it's a three-letter acronym, jon's got it covered!


>    Then you can add WAI accessibility features so screen readers 
>    actually know what they are reading.

hey, if you've got the "expert" that everyone is "inviting",
then it would be stupid not to make good use of him, eh?


>    Then you'll be about half the way PGTEI is now.

half-way!   woo-hoo!   so what they say about open-source is true!
the march toward excellence just seems to happen automagically!


>    I guess, hoping that you'll stop bothering this list until you have
>    fixed your processes, is out of the question?

i "fixed" page 3 so it "validates".   as soon as you report the "error"
on page 4, i'll put that on my list to "repair" tomorrow.   so much fun!

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 10:28:37 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 10:28:46 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <594.40e032fd.32690fc5@aol.com>

marcello said:
>    Every time somebody scores over you 
>    by attacking your "evidence" frontally, 
>    you go nonlinear for hours.

you really believe you've "scored over me"
with that lame validation crap, don't you?

well, keep humoring yourself, marcello...


>    you go nonlinear for hours.

nonlinear?   well, i _sleep_ on occasion, but...

***

meanwhile, other trolls have joined in now,
with nothing more to contribute than to tell
the hundreds of people subscribed here that
_they_ have me in their killfile.   bully for you.

but your attempt to get people to stop paying
attention by flooding their e-mailboxes with
nothingness is so transparent it's easy to reveal.

-bowerbird
From desrod at gnu-designs.com  Thu Oct 19 10:28:47 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 10:30:32 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <490.ba2c344.32690b77@aol.com>
References: <490.ba2c344.32690b77@aol.com>
Message-ID: <1161278927.6501.10.camel@localhost.localdomain>

> boo-ya!  the fun will run into 2009!

> as soon as jon noring joins our little open-source project here,
> i'm sure he'll be happy as a pig in shit to tell us how to do that...

> speaking of which, i'm sure jon is _so_ proud of me now that i'm
> engaging in an open-source project.  just what he always wanted!

> again, jon noring will be our go-to guy on that.

> ditto.  if it's a three-letter acronym, jon's got it covered!

> hey, if you've got the "expert" that everyone is "inviting",
> then it would be stupid not to make good use of him, eh?

> half-way!  woo-hoo!  so what they say about open-source is true!
> the march toward excellence just seems to happen automagically!

> i "fixed" page 3 so it "validates".  as soon as you report the "error"
> on page 4, i'll put that on my list to "repair" tomorrow.  so much
> fun!

I have to wonder if anyone has done a serious psychological profile on
our fellow Bowerbird here, based on the various replies he's made over
the years. 


-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From Bowerbird at aol.com  Thu Oct 19 10:44:20 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 10:44:48 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <51a.593388a4.32691374@aol.com>

i'm not sure what it was about my postings lately
that has drawn the trolls out of the caves where
they had been sequestered for some time, but
i sure wish they'd go back.   the silence was nice...

at any rate, let's talk about what all this boils down to.

when your e-book format is _complex_and_obtuse_,
like .xml or .tei, it's difficult to program tools for it,
and you need people with expertise to maintain it.
that is the sad reality faced by the technoids here...

when your file-format is _simple_, like my z.m.l.,
it is exceedingly easy to program tools for it, and
even a large library of files can be maintained by
an above-average 4th-grader.   makes sense, right?

it all boils down to costs and benefits, and their ratio.

and it doesn't matter how much flak my detractors
pitch at me, in the long run, it's _always_ going to 
boil down to costs and benefits, and their ratio...

so beware the technoids who want you to buy their
complex systems, and keep paying for them forever...

-bowerbird
From desrod at gnu-designs.com  Thu Oct 19 10:57:14 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 10:58:25 2006
Subject: [gutvol-d] what it all boils down to
In-Reply-To: <51a.593388a4.32691374@aol.com>
References: <51a.593388a4.32691374@aol.com>
Message-ID: <1161280634.6501.18.camel@localhost.localdomain>

> when your e-book format is _complex_and_obtuse_,
> like .xml or .tei, it's difficult to program tools for it,
> and you need people with expertise to maintain it.
> that is the sad reality faced by the technoids here...

When your system is incompatible with standards and obtuse, you go
reinvent your own format, and have to write tools from scratch to
support it. It makes sense to make the format and the tools as simple as
possible, to speed delivery. 

When you work with standards and simple formats like XML and XML-based
derivatives, you can use the wealth of tools and other resources that
others have written instead of reinventing your own wheel, simply
because you refuse to acknowledge that others have done the work so you
don't have to. 

> it all boils down to costs and benefits, and their ratio.

I agree, which is why I personally use standards-compliant tools and
methodologies, centered around a unified, agreed-upon format which can
then be converted into any other format I wish. 

> and it doesn't matter how much flak my detractors
> pitch at me, in the long run, it's _always_ going to 
> boil down to costs and benefits, and their ratio...

Use what works for you, just don't presume that others have better ideas
than your own or have done it in a different way which works for
others. 

> so beware the technoids who want you to buy their
> complex systems, and keep paying for them forever... 

And beware of propritary, unaccepted formats which exist in a vacuum
without community or industry support, tools and resources to support
them. 


-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From Bowerbird at aol.com  Thu Oct 19 11:37:07 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 11:37:13 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <c7c.99fa62.32691fd3@aol.com>

david said:
>    When your system is incompatible with standards and obtuse, 
>    you go reinvent your own format, and have to write tools from scratch 
>    to support it. 

people invent new approaches all the time.

the best of those get turned into "standards".
(at least sometimes anyway.   ok, _occasionally_.)

but nothing can turn back an idea whose time has come,
and the time for the idea of _simplicity_ is now upon us...

soon it will be as hard to sell _complexity_ as it is to sell a _c.d._,
and let us all observe a moment of silence now for tower records.


>    When you work with standards and simple formats 
>    like XML and XML-based derivatives, you can use 
>    the wealth of tools and other resources that
>    others have written

i would think that jon and lee would appreciate your pointers to
"the wealth of tools and other resources that others have written",
since they're undertaking a project to reinvent the wheel yet again.

heck, i think we'd _all_ appreciate those pointers, david...


>    I agree, which is why I personally use standards-compliant tools 
>    and methodologies, centered around a unified, agreed-upon format 
>    which can then be converted into any other format I wish.

that seems like a sensible position.

and i assume that it means that when the _simple_yet_fully_effective_
format is the one that is the "unified, agreed-upon format", that you
will use it.   of course you will, you're a practical man.   so will 
everyone.
when costs are low and benefits are high, the decision is a no-brainer.

in the meantime, someone has to create that simple-yet-fully-effective
format.   i'm one of those someone's, but i'm not the only one, not at all.
markdown, textile, wiki-formatting, crossmark, there's a bunch of 'em,
all growing out of the philosophy that people don't wanna do markup.


>    Use what works for you

gee, thanks for giving me your permission, david!           :+)


>    just don't presume that others have better ideas than your own

ok, i won't presume that others have better ideas than my own.

(wha?)

but i will be open to the possibility, if you don't mind, because
otherwise i would become too inflexible, and i _like_ flexibility.

besides, there have been too many cases in the past where
someone else had a better idea than mine.   (so i adopted it.)


>    or have done it in a different way which works for others.

one of us is confused here.   am i supposed to _not_ "presume" this?
because it seems pretty obvious to me that some people do things
"in a different way" than i do, and i _assume_ they are doing that
because it "works" for them -- or at least that they _think_ it does.
(which doesn't mean they're necessarily correct in that judgment.)


>    And beware of propritary, unaccepted formats which 
>    exist in a vacuum without community or industry support, 
>    tools and resources to support them.

i guess "proprietary" is such a dirty word to you that
you can't even bring yourself to spell it correctly, eh?
maybe even took it out of your spellcheck dictionary?

at any rate, i think a quick perusal of the "11 rules" of z.m.l.
will quash the all-too-silly notion that z.m.l. is "proprietary".
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt

but then again, maybe today's patent office _would_ give me
a patent on sterling ideas like "4 blank lines before a header"
or "indent lines of the poem however much you want them to
be indented, but use at least one space of indentation so we
know that it's a _block_ and that it shouldn't be re-wrapped"...

because those are some really earth-shaking ideas, right?
hello gallileo, the catholics will surely condemn me too...

yet would it not be sweet if i could demand a patent royalty
from anyone who ever used 4 blank lines in a row?   awesome!
i'd be rich!   rich, i tell you!   filthy bloody rich!   radical!         ;+)
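
The two rules quoted above can be sketched in a few lines of Python. This is only an illustration of the idea, not the z.m.l. viewer's code: the function name `classify` is hypothetical, and treating "4 blank lines" as "4 or more" is an assumption.

```python
def classify(lines):
    """Label each non-blank line as 'header', 'block', or 'text'.

    Assumed rules (paraphrased from the message, not the full z.m.l. spec):
      - a non-blank line preceded by 4 or more blank lines is a header
      - a line starting with at least one space belongs to a block
        that must not be re-wrapped
    """
    labelled = []
    blank_run = 0  # number of consecutive blank lines seen so far
    for line in lines:
        if line.strip() == "":
            blank_run += 1
            continue
        if blank_run >= 4:
            kind = "header"
        elif line.startswith(" "):
            kind = "block"
        else:
            kind = "text"
        labelled.append((kind, line.strip()))
        blank_run = 0
    return labelled
```

Under these assumptions, a line preceded by only three blank lines is quietly labelled as ordinary text rather than reported as a problem.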

-bowerbird
From marcello at perathoner.de  Thu Oct 19 12:01:10 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 12:01:14 2006
Subject: [gutvol-d] what it all boils down to
In-Reply-To: <c7c.99fa62.32691fd3@aol.com>
References: <c7c.99fa62.32691fd3@aol.com>
Message-ID: <4537CB76.9040200@perathoner.de>

Bowerbird@aol.com wrote:

> i guess "proprietary" is such a dirty word to you that
> you can't even bring yourself to spell it correctly, eh?

That's the second spelling flame by you today.


> and, of course, ad hominem arguments are always cheap shots.
> (Bowerbird, 01 Jan 2004)

Spelling flames are ad hominem arguments. Funny. Isn't it?


> hello gallileo, the catholics will surely condemn me too...
        ^^^^^^^^

Even funnier, when the spelling flamers go on, they usually knock
themselves out.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Thu Oct 19 12:08:08 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 12:08:12 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <1161278927.6501.10.camel@localhost.localdomain>
References: <490.ba2c344.32690b77@aol.com>
	<1161278927.6501.10.camel@localhost.localdomain>
Message-ID: <4537CD18.3070703@perathoner.de>

David A. Desrosiers wrote:

> I have to wonder if anyone has done a serious psychological profile on
> our fellow Bowerbird here, based on the various replies he's made over
> the years. 

Here, here! Me! Me!

  http://www.gnutenberg.de/bowerbird/


-- 
Marcello Perathoner
webmaster@gutenberg.org

From cannona at fireantproductions.com  Thu Oct 19 12:09:45 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Thu Oct 19 12:12:08 2006
Subject: [gutvol-d] what it all boils down to
References: <51a.593388a4.32691374@aol.com>
Message-ID: <006801c6f3b2$79666600$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bowerbird wrote:
>
> when your file-format is _simple_, like my z.m.l.,
> it is exceedingly easy to program tools for it, and
> even a large library of files can be maintained by
> an above-average 4th-grader.   makes sense, right?

Of course, the files will look like they were maintained by an above average
4th grader, because most 9-10 year olds, even if they are above average,
probably won't quite grasp the importance of accurately representing complex
tables, formulas, or accessibility.  Of course, that won't matter because
the format doesn't support them anyway.

> it all boils down to costs and benefits, and their ratio.

Exactly.  However, a very small cost over an even smaller benefit gives you
a high ratio, which, in this context, is undesirable.  I know you would argue
that the benefits of ZML are high, but I would submit that if a tool
doesn't do the job that you need done, then it is next to worthless.

> so beware the technoids who want you to buy their
> complex systems, and keep paying for them forever...

Good advice.  I would only add, beware those who extol the virtues of their
system while ignoring, glossing over, or minimizing the importance of its
flaws.

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFN84JI7J99hVZuJcRAkVlAKDXGtTHFbKeNlYt8cJenhsEQnD4igCgkQAu
WN+uJoEHfhRVLoid4Gf2Mc8=
=XqVB
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Thu Oct 19 12:31:02 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 12:31:09 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <caf.10e8150.32692c76@aol.com>

marcello said:
>    Spelling flames are ad hominem arguments.

no they aren't.

if you "accused" someone of being a bad speller,
that might be ad hominem.   it would also be a
pretty lame accusation, since i can't see how it
would impact on the chain of their arguments...


>    Even funnier, when the spelling flamers go on, 
>    they usually knock themselves out.

26,000+ hits on "gallileo" at google, so i went with it.
of course, if i'd tried "galileo", i woulda got 26 _million_.

live and learn.

i also made a typo recently with "purse" for "pursue".

i usually just mention spelling errors in passing, on the way
to a bigger point, and usually only if they reveal something
that i think is interesting, like lee stumbling over "authors"
or david tripping up on "proprietary", funny things like that.

but hey, i'll give you full credit for a spell-catch, marcello...        :+)

believe me, spelling means a whole helluva lot more to me
than w3c "validation", assuming my code works just fine...

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 12:36:18 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 12:36:24 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <c26.6eeb901.32692db2@aol.com>

aaron said:
>   Of course, that won't matter because
>    the format doesn't support them anyway.

aaron, i suggest you do something _constructive_,
like make a wiki-page that lists all of the e-texts
from the p.g. library that z.m.l. "cannot handle"...

then instead of making these vague accusations,
you can actually _point_to_ some solid _evidence_.
that'll save you time, and be much more effective...

it will also be a lot more _fair_, though that doesn't
seem to be something that you care about very much,
because i'll be able to point out where you're wrong...

-bowerbird
From marcello at perathoner.de  Thu Oct 19 13:13:22 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 13:13:26 2006
Subject: [gutvol-d] [Fwd: [webgroup] Database server upgrade notice]
Message-ID: <4537DC62.7050706@perathoner.de>

The PG catalog resides on this server. The wiki will not be affected.


-------- Original Message --------
Subject: [webgroup] Database server upgrade notice
Date: Thu, 19 Oct 2006 15:56:53 -0400 (EDT)
From: Ken Chestnutt <ken@metalab.unc.edu>
To: webgroup@lists.ibiblio.org
CC: ibiblio-announce@lists.ibiblio.org

Dear ibiblio users,

   We would like to thank you for hosting your content with us.  We have
been steadily growing over the past few months in both size and variety of
content. It is because of this growth that we need to do some maintenance.

   We will be upgrading one of our database servers Sunday night.  We are
scheduling a three hour downtime, starting at 9:00 p.m. EDT on Sunday,
October 22.

   This downtime will not affect everyone.  It will only affect users who
have a database on the server mysql2.  If you do not have a database or if
your database is on mysql.ibiblio.org, you will remain unaffected by this
upgrade.

   If you have any questions, please send us email at "help@ibiblio.org".

Thanks
Ken Chestnutt
ibiblio.org

_______________________________________________
webgroup mailing list
webgroup@lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/webgroup


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Thu Oct 19 13:26:22 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Oct 19 13:26:26 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <594.40e032fd.32690fc5@aol.com>
References: <594.40e032fd.32690fc5@aol.com>
Message-ID: <4537DF6E.5070205@perathoner.de>

Bowerbird@aol.com wrote:

> marcello said:
>>    you go nonlinear for hours.
> 
> nonlinear?   well, i _sleep_ on occasion, but...

nonlinear: adj.

    [scientific computation]

    1. Behaving in an erratic and unpredictable fashion; unstable. When
used to describe the behavior of a machine or program, it suggests that
said machine or program is being forced to run far outside of design
specifications. This behavior may be induced by unreasonable inputs, or
may be triggered when a more mundane bug sends the computation far off
from its expected course.

    2. When describing the behavior of a person, suggests a tantrum or a
flame. "When you talk to Bob, don't mention the drug problem or he'll go
nonlinear for hours." In this context, go nonlinear connotes "blow up
out of proportion" (proportion connotes linearity).

  ---- http://www.catb.org/~esr/jargon/html/N/nonlinear.html


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Thu Oct 19 13:39:27 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 13:39:46 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c66.2c6c517.32693c7f@aol.com>

marcello said:
>    This behavior may be induced by unreasonable inputs

well, there you have it!


>    When describing the behavior of a person, 
>    suggests a tantrum or a flame.

i'm confident the hundreds of lurkers here
can tell -- without any difficulties at all -- 
who is being rational, solid, and thoughtful,
and who is not.

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 13:51:30 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 13:51:41 2006
Subject: [gutvol-d] leeward -- a first-draft open-source pass at "babelfish"
Message-ID: <c0a.735a8db.32693f52@aol.com>

i said:
>   the first part -- a wordprocessor-like, wysiwyg-y authoring-tool is
>    simple enough when the file-format is something as simple as .zml.

i know it can be infuriating when someone says "oh that's simple",
and hey, well maybe it isn't easy for you, for one reason or another.

and i certainly wouldn't want to be infuriating...

so lee, what i've done is to create "leeward", a nice little text-editor
that is a first-draft pass on the specifications you laid out for us...

no, this doesn't mean that i'm gonna start _programming_ on our
little open-source effort here, that's _your_ job, i'm the _designer_,
but i figure it can't hurt to do just a little bit to help you get started.

besides, i didn't even write this program, i just took the code for
a sample app that realbasic distributes freely with their compiler.

however, i did change the label on the "italics" toggle from "i" to "e"
-- for "emphasis", of course -- and likewise the label on the "bold"
toggle from "b" to "s" (for "strong") to reflect the x.m.l. philosophy.
i thought you'd appreciate the sensitivity of that nice little touch...

(since i really don't know how to _show_ "emphasis" other than by
using italics, and likewise bold for "strong", i just left that styling in.
i hope that's ok.   think of it as a headstart on the c.s.s.)

anyway, lee, i compiled versions for o.s.x., windows, and linux, and
i would be happy to e-mail you whichever ones you'd like to see...

i'll even upload them to the web if anyone else wants to see them...

to get the source-code, you can just download it from realbasic.
>    http://www.realsoftware.com
it's the "text-editor" sample code.   let me know if you can't find it.

yes, lee, i know you'll probably use some other language/compiler.
that's fine.   because this is just a little nudge to help you get going,
to show you how simple it can be.   and i'm sure that whatever tool
you choose to use to do your programming, you'll be able to find
some simple text-editor code to serve as the springboard for you...

anyway, lee, despite your seeming reluctance, i know that eventually
you will come around to our little project.   after all, it's open-source!
so how can you resist?

-bowerbird
From brett at dimetrodon.demon.co.uk  Thu Oct 19 14:23:51 2006
From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar)
Date: Thu Oct 19 14:25:06 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c66.2c6c517.32693c7f@aol.com>
References: <c66.2c6c517.32693c7f@aol.com>
Message-ID: <ZCoRSZFnz+NFFwEe@dimetrodon.demon.co.uk>

Bowerbird@aol.com writes
>marcello said:
>>    This behavior may be induced by unreasonable inputs
>
>well, there you have it!
>
>
>>    When describing the behavior of a person,
>>    suggests a tantrum or a flame.
>
>i'm confident the hundreds of lurkers here
>can tell -- without any difficulties at all --
>who is being rational, solid, and thoughtful,
>and who is not.

As a lurker I would like to say I have developed the overwhelming 
impression that you are a pointless posturing idiot who will never 
develop a single piece of useful software or indeed ever contribute 
anything of value.

On zml as far as I can see it is a feature deficient mark up with only 
one, largely pointless, advantage over a very basic xml, namely the 
source code looks nicer when viewed in notepad or similar, at the cost 
of making the source code a lot harder to edit. It has a very limited 
feature set and it is very hard to extend that set if it turns out you 
need additional features. It also tends to mean that it is easy for a 
typo to result in code that validates but is incorrect, e.g. three 
carriage returns rather than four in zml is valid but incorrect, while 
accidentally replacing "<h1> </h1>" with "<h2> </h2>" is much harder.
-- 
Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm
Livejournal http://brett-dunbar.livejournal.com/
Brett Paul Dunbar
To email me, use reply-to address
From Bowerbird at aol.com  Thu Oct 19 14:53:35 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 14:53:46 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c0e.13123be.32694ddf@aol.com>

brett said:
>    I have developed the overwhelming impression that 
>    you are a pointless posturing idiot who will never
>    develop a single piece of useful software or 
>    indeed ever contribute anything of value.

if you have specific feedback about my software,
i'd love to hear it.   otherwise, i get the impression
that you haven't even looked at it.   am i incorrect?


>    On zml as far as I can see it is a feature deficient mark up 

again, if you mentioned features in which it is "deficient",
your rant would have less emotionality and more power...


>    with only one, largely pointless, advantage over 
>    a very basic xml, namely the source code looks nicer 
>    when viewed in notepad or similar

well, it _does_ look nicer in a text-editor, thanks for noticing.

but the real difference is that it is endowed with a number of
e-book capabilities when loaded into a zml-viewer-program.

even if you can get those same capabilities with a more complex
format, the question is why you'd pay the extra cost of complexity
for the same set of benefits.   i can't think of _any_ good reason...


>    at the cost of making the source code a lot harder to edit.

i don't see how you can even suggest that editing .zml would be
"a lot harder" than editing .xml.   that's ridiculous on the face of it.


>    It has a very limited feature set 

what are the limitations?


>    and it is very hard to extend that set 
>    if it turns out you need additional features.

no, i've found it's simple to extend the idea to get new features.

that's why i keep asking for suggestions, so if anyone comes up
with something that i think needs to be added, i can do it now...


>    It also tends to mean that it is easy for a typo to result in 
>    code that validates but is incorrect

first, the concept of "validation" doesn't apply to z.m.l.   sorry.

z.m.l. works the way it works.   if you don't get what you want,
then you need to rework your file until you get what you want.

fortunately, it's easy to "rework your file" to get what you want,
especially since there is such a small number of simple "rules"...


>    three carriage returns rather than four in zml is valid but incorrect

3 empty lines won't give you a _header_, so if that's what you want,
yes, you'll need to add another empty line.   but since the wysiwyg
display will show you -- clearly -- that you do not have a header,
it'll be obvious to you that you need to put in more empty lines...

furthermore, the "contents" listbox will let you bump up (or down)
the _level_ of each header, in the style of "outliner" applications, and
display them _as_ an outline, so i think you need a better example...

besides, for the people who need to see a more _visible_ form
of header-markup, i _might_ also support _atx_ header markup:

+++ this would be a level-three header.

++ this would be a level-two header.

+ this would be a level-one header.

***

i'm also considering the se-text style of headers:

this is a level-one header
==========================

and this is a level-two header
------------------------------

***

i'm still undecided on those, since it seems to me that
the number-of-blank-lines method is easy enough to grok,
and it represents the "invisible" ideal of zen markup best,
but when i do decide, i promise to announce it here _first_.
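
For comparison, both of the visible header styles shown above are easy to detect mechanically. The following is a minimal Python sketch, not part of any z.m.l. tool. The function name `find_headers` is hypothetical, and the sketch assumes that the number of `+` signs gives the level, that `=` underlines mean level one and `-` underlines level two, and that an underline is at least three characters long.

```python
import re

def find_headers(lines):
    """Return (level, title) pairs for the two visible header styles.

    Assumptions for this sketch: '+' markers count the level as in the
    examples, '=' underlines mean level one, '-' underlines level two,
    and an underline must be at least 3 characters long.
    """
    headers = []
    for i, line in enumerate(lines):
        # atx-like style: one or more '+' signs, a space, then the title
        m = re.match(r"^(\++) +(\S.*)$", line)
        if m:
            headers.append((len(m.group(1)), m.group(2).strip()))
            continue
        # se-text style: a title line followed by an underline
        if i + 1 < len(lines) and line.strip():
            underline = lines[i + 1].strip()
            if re.fullmatch(r"={3,}", underline):
                headers.append((1, line.strip()))
            elif re.fullmatch(r"-{3,}", underline):
                headers.append((2, line.strip()))
    return headers
```

Unlike the blank-line rule, these markers survive tools that collapse or trim whitespace, which is presumably part of their appeal.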

***

so anyway, brett, thanks for your feedback.   
but more precision, next time, if you please.

-bowerbird
From cannona at fireantproductions.com  Thu Oct 19 15:46:12 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Thu Oct 19 15:47:31 2006
Subject: [gutvol-d] what it all boils down to
References: <c26.6eeb901.32692db2@aol.com>
Message-ID: <026a01c6f3d0$8f1a4b10$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bowerbird Wrote:

> aaron, i suggest you do something _constructive_,
> like make a wiki-page that lists all of the e-texts
> from the p.g. library that z.m.l. "cannot handle"...
>
> then instead of making these vague accusations,
> you can actually _point_to_ some solid _evidence_.
> that'll save you time, and be much more effective...

Do things that you've said count for evidence in your opinion?

"since google will be scanning 10-40 million books, there are plenty
of non-math texts for me to work on before i do the math books..."

">   Sure we do. We use TeX (or pseudo-TeX fragments).

and that's why that's what i'll probably do as well,
when the time comes that i feel that it's necessary,
because that's my modus operandi, to utilize the
existing conventions, to best leverage current work."

"but for now, i'm not at all worried about this 'problem'."

"ultimately, for music, i will probably do exactly
what d.p. has done, and use lilypond or finale,
and route that file to either an external player
or one that i have embedded in my viewer-app.
unlike music-markup-language, lilypond shares
my core philosophy of simplicity and elegance...

"i'll follow the same approach for math equations,
routing them to an equation editor that is either
(a) an external app, or (b) embedded in my viewer.
i'd guess it will probably be tex-based rather than
math-markup-language, as tex is widely preferred,
and expressible in utf-8.  (math-markup-language
is also expressible in utf-8, but it's also got all that
angle-bracket gunk in it, which i'm badly allergic to.)"

"meanwhile, for the 99.7% of the project gutenberg library which
currently has no need _at_all_ (let alone any _compelling_ need)
for math equations, i don't have to worry about them, thank you."

"  my own system will eventually be able to handle quite complex tables,
when i find the need to develop it that far."

"there aren't a whole lot of tables in the e-texts -- we're talking
literature, not spreadsheets -- but your system should handle tables anyway;
not really big and hairy ones, just simple ones"


"my own system will eventually be able to handle quite complex tables,
when i find the need to develop it that far.  and if you'd like some proof,
then hand me a list of 100 e-texts that use tables, and i will tackle them
first when the time for "attacking tables" comes up big on my agenda...
(and leave out the spalding baseball guides, i already know about them.)"

So, either you are really confused, or your format doesn't support complex
tables and mathematics.  I am curious though; if you already know that your
format doesn't support these things, why do you need us to find the ebooks
that contain them for you?  Is it that you are unwilling to do the work
yourself?

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOACCI7J99hVZuJcRAt9tAKCYYAxarzSpmMOPFpBt1VTHNGl+JACg9mNR
CY3Yy+m3G8udKoBy623InT8=
=V2un
-----END PGP SIGNATURE-----

From brett at dimetrodon.demon.co.uk  Thu Oct 19 15:53:33 2006
From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar)
Date: Thu Oct 19 15:54:49 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c0e.13123be.32694ddf@aol.com>
References: <c0e.13123be.32694ddf@aol.com>
Message-ID: <Jcp1FJHtHAOFFw$d@dimetrodon.demon.co.uk>

Bowerbird@aol.com writes
>brett said:
>>    I have developed the overwhelming impression that
>>    you are a pointless posturing idiot who will never
>>    develop a single piece of useful software or
>>    indeed ever contribute anything of value.
>
>if you have specific feedback about my software,
>i'd love to hear it.   otherwise, i get the impression
>that you haven't even looked at it.   am i incorrect?
>
>
>>    On zml as far as I can see it is a feature deficient mark up
>
>again, if you mentioned features in which it is "deficient",
>your rant would have less emotionality and more power...

Mathematical formulae, exotic typography, support for features
impossible in print.

>
>>    with only one, largely pointless, advantage over
>>    a very basic xml, namely the source code looks nicer
>>    when viewed in notepad or similar
>
>well, it _does_ look nicer in a text-editor, thanks for noticing.
>
>but the real difference is that it is endowed with a number of
>e-book capabilities when loaded into a zml-viewer-program.

All of which can be done in a basic xml just as easily. And as a format,
xml has the major advantage of being designed to allow the addition of extra
features if they become necessary.

>even if you can get those same capabilities with a more complex
>format, the question is why you'd pay the extra cost of complexity
>for the same set of benefits.   i can't think of _any_ good reason...

Future proofing: you can simply not use a feature that is present that
you don't need, while you have real problems if you need a feature that
isn't there.


>>    at the cost of making the source code a lot harder to edit.
>
>i don't see how you can even suggest that editing .zml would be
>"a lot harder" than editing .xml.   that's ridiculous on the face of it.

The mark-up is largely invisible, being white space, and therefore hard
to see; some of it relies on counting carriage returns, which is hard to
do by eye, while adding a few angle brackets to a piece of text is
pretty simple.

>
>>    It has a very limited feature set

>what are the limitations?

It only represents a fairly limited range of formatting. For example,
there is no clear way of representing tables, complex formulae, or various
forms of exotic formatting. In _Jingo_, Terry Pratchett uses an
Arabic-looking font to represent Klatchian, and he has Captain Carrot use
some words mostly in that font but with the h in the normal font, as
Carrot is mispronouncing the h. I very much doubt zml could handle
that; nonetheless it is required for that book.

>
>>    and it is very hard to extend that set
>>    if it turns out you need additional features.
>
>no, i've found it's simple to extend the idea to get new features.
>that's why i keep asking for suggestions, so if anyone comes up
>with something that i think needs to be added, i can do it now...

That isn't the point. The point is that if a feature turns out to be needed
at some point in the future, and has been omitted from the original
spec, xml can be extended to include it in a straightforward manner; zml
can't.
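
A rough illustration of that extensibility point, in Python, using the standard library's `xml.etree.ElementTree`. The element names `klatchian` and `plain` are invented for this sketch and do not come from any real schema; the idea is only that code written before such elements existed can still read a document that uses them.

```python
import xml.etree.ElementTree as ET

def plain_text(fragment):
    """Extract the readable text from an XML fragment.

    The function knows nothing about specific element names, so
    elements added to the format later do not break it.
    """
    root = ET.fromstring(fragment)
    # itertext() walks every element, known or not, in document order
    return "".join(root.itertext())
```

For example, `plain_text('<p>a <klatchian>w<plain>h</plain>at</klatchian> b</p>')` still yields the running text, while a newer viewer could choose to render the hypothetical `klatchian` span in a different font.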

>
>>    It also tends to mean that it is easy for a typo to result in
>>    code that validates but is incorrect
>
>first, the concept of "validation" doesn't apply to z.m.l.   sorry.

The point of a validator is to catch automatically things like a part
header being accidentally shown as a chapter header; this
makes it easier to find and fix errors.

>z.m.l. works the way it works.   if you don't get what you want,
>then you need to rework your file until you get what you want.
>
>fortunately, it's easy to "rework your file" to get what you want,
>especially since there is such a small number of simple "rules"...
>
>
>>    three carriage returns rather than four in zml is valid but incorrect
>
>3 empty lines won't give you a _header_, so if that's what you want,
>yes, you'll need to add another empty line.   but since the wysiwyg
>display will show you -- clearly -- that you do not have a header,
>it'll be obvious to you that you need to put in more empty lines...

If you are using a special application to edit it anyway there is no
reason to obscure the mark-up in the source code. With xml you can also
edit in something like notepad rather more easily.

That is also requiring you to spot manually an error that an xml
validator could catch automatically.
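
A small Python sketch of the kind of automatic check meant here, using the standard library's `xml.etree.ElementTree`. Note the assumption: this only tests well-formedness (matching open and close tags), not validity against a DTD or schema, which the standard library does not check. The helper name `is_well_formed` is made up for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # e.g. an opening <h1> closed by </h2>
        return False
```

A half-finished edit such as `<h1>Chapter One</h2>` is rejected by the parser, whereas a consistent change to `<h2>Chapter One</h2>` still parses; catching that second case would need a schema or some other structural rule.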

>
>furthermore, the "contents" listbox will let you bump up (or down) the
>_level_ of each header, in the style of "outliner" applications, and
>display them _as_ an outline, so i think you need a better example...

Tom's ebookreader <http://pws.prserv.net/Fellner/Software/index.htm> has
long done something like that.

>
>besides, for the people who need to see a more _visible_ form
>of header-markup, i _might_ also support _atx_ header markup:
>
>+++ this would be a level-three header.
>
>++ this would be a level-two header.
>+ this would be a level-one header.


I don't really see this as having any detectable advantage over the xml
style angle brackets. These markers have the disadvantage that a typo will
still be valid but incorrect, while a similar typo is liable to produce
invalid xml. The angle bracket system also allows the easy addition of
further format types at a later date.


>***
>i'm also considering the se-text style of headers:
>this is a level-one header
>==========================
>
>and this is a level-two header
>------------------------------
>
>***
>
>i'm still undecided on those, since it seems to me that
>the number-of-blank-lines method is easy enough to grok,
>and it represents the "invisible" ideal of zen markup best,
>but when i do decide, i promise to announce it here _first_.

Blank lines are easy to understand; they are hard to edit, which is the
problem.
-- 
Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm
Livejournal http://brett-dunbar.livejournal.com/
Brett Paul Dunbar
To email me, use reply-to address
From Bowerbird at aol.com  Thu Oct 19 16:12:48 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 16:12:56 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <42c.8147b3d.32696070@aol.com>

aaron said:
>    So, either you are really confused, 
>    or your format doesn't support complex tables and mathematics.

i'm not confused in the slightest, aaron.

z.m.l. supports math equations via graphics.

z.m.l. supports "complex tables" by advising that
they be broken down to "simple tables" instead...

this kind of support is sufficient for the time being.

however, if you were to show me some actual e-texts
that are currently in the p.g. library for which you would
like to see more extensive support, i'd be happy to look
at those e-texts and tell you what i would consider doing.

if, on the other hand, you just want to make some vague claims
about what z.m.l. is "incapable" of handling, then i won't bother
to even respond to your e-mails any more, for obvious reasons...


>    I am curious though; if you already know that your format 
>    doesn't support these things, why do you need us to find 
>    the ebooks that contain them for you?

i am pretty familiar with what's in the library...

i think the methodology i have in place already
is more than sufficient for handling what's there.

but i'm more than willing to entertain the possibility
that there are e-texts that it would be good for me to
take a close hard look at, so if you have any pointers
in that regard, i would be most happy to receive them.

otherwise, i'll assume you can't find anything that would
be too difficult to handle, and i'll just continue on with my
present plans for my survey of the total library as scheduled.


>    Is it that you are unwilling to do the work your self?

no.   as i said, i'll get around to the whole library eventually.

in the meantime, i'm just calling your bluff.

so you can either show your cards, or keep on stalling.

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 16:27:41 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 16:27:49 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c16.76d0fb6.326963ed@aol.com>

brett said:
>    Mathematical formulae

graphics.   [heavy sigh]


>    exotic typography

graphics.


>    support for features impossible in print.

such as?


>   All of which can be done in a basic xml just as easily.

"just as easily" is a decision that end-users will make.


>   Future proofing

yeah, right.   we've had, what, 4 versions of .html
in the last decade, and even that is now outdated?

x.m.l. advocates seem to have a _very_ short memory.
that's not what i see as being good for "future proofing".


>   some of it relies on counting carriage returns, 
>    which is hard to do by eye.

that's why you have the computer do it for you,
and show the results in a starkly obvious way...


>   I very much doubt zml could handle that, 
>    nonetheless it is required for that book.

let me know when that book hits the p.g. library,
because i'm quite sure i'll be able to handle it then.

or alternately, show me the x.m.l. "solution" for it.


>   That isn't the point, the point is 
>    if a feature turns out to be needed at some point in the future, 
>    and has been omitted from the original spec, 
>    xml can be extended to include it in a straightforward manner 
>    zml can't.

of course z.m.l. can add new features when needed.


>   header being accidentally shown as a chapter header automatically, 
>    this makes it easier to find and fix errors.

i build that capability into the authoring-tool, where it's most useful.


>   If you are using a special application to edit it anyway 
>    there is no reason to obscure the mark-up in the source code. 

except because obfuscatory mark-up is ugly, that's all.
not to mention obfuscatory.   and that it gets in the way.


>    With xml you can also edit in something like notepad rather more easily.

here again you say something that's ridiculous on its face.

editing x.m.l. in notepad.   yeah, right.   that's the solution!
i'm perplexed why lee and jon noring never thought of that!


>    Tom's ebookreader <http://pws.prserv.net/Fellner/Software/index.htm> 
>    has long done something like that.

very few things in z.m.l. are unprecedented.   that's the point of it.


>   I don't really see this as having any detectable advantage over the xml

then i suggest you stick with x.m.l., brett.   believe me, that's o.k. with me!


>   Blank lines are easy to understand, they are hard to edit, 
>    which is the problem.

i don't know about _your_ keyboard, brett, but mine has this
big key that is clearly labeled "return" in a prominent place...

and there's another one -- labeled "enter" -- in the corner...

-bowerbird
From desrod at gnu-designs.com  Thu Oct 19 16:35:56 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 16:36:36 2006
Subject: [gutvol-d] leeward -- a first-draft open-source pass at
	"babelfish"
In-Reply-To: <c0a.735a8db.32693f52@aol.com>
References: <c0a.735a8db.32693f52@aol.com>
Message-ID: <1161300956.6501.24.camel@localhost.localdomain>

On Thu, 2006-10-19 at 16:51 -0400, Bowerbird@aol.com wrote:
> anyway, lee, despite your seeming reluctance, i know that eventually
> you will come around to our little project.  after all, it's
> open-source! so how can you resist? 

If it's truly "open source", where is the code that YOU changed? Pointing
back to the upstream source is not sufficient to comply with most OSI
approved licenses. 


-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From desrod at gnu-designs.com  Thu Oct 19 16:36:28 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Thu Oct 19 16:37:36 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c66.2c6c517.32693c7f@aol.com>
References: <c66.2c6c517.32693c7f@aol.com>
Message-ID: <1161300988.6501.26.camel@localhost.localdomain>

On Thu, 2006-10-19 at 16:39 -0400, Bowerbird@aol.com wrote:
> i'm confident the hundreds of lurkers here
> can tell -- without any difficulties at all -- 
> who is being rational, solid, and thoughtful,
> and who is not. 

Of that, I have no doubt. ;) 

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From cannona at fireantproductions.com  Thu Oct 19 16:42:46 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Thu Oct 19 16:42:48 2006
Subject: [gutvol-d] what it all boils down to
References: <42c.8147b3d.32696070@aol.com>
Message-ID: <006501c6f3d8$493b5af0$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bowerbird wrote:

> i'm not confused in the slightest, aaron.
>
> z.m.l. supports math equations via graphics.
>
> z.m.l. supports "complex tables" by advising that
> they be broken down to "simple tables" instead...
>
> this kind of support is sufficient for the time being.
>
> however, if you were to show me some actual e-texts
> that are currently in the p.g. library for which you would
> like to see more extensive support, i'd be happy to look
> at those e-texts and tell you what i would consider doing.

I stand corrected.  Under your definition of "support", it appears that you
can support any book in the PG library because after all, any book with
complex formatting or using special notation can simply be displayed via
graphics.  If the pages are too wide to fit in a single image, that's ok,
because ZML "supports" them by suggesting that they be broken down into
simpler images.

So, as a result of our new definition of support, I am forced to state
unequivocally that ZML can support anything in the PG library, but it does
so very very poorly.

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOA15I7J99hVZuJcRAnUpAKC9jO7kk26d/WqVWAVLl2YIvrLJJgCgl8BO
NI4wOvZL7Bq598J4ut7vnwc=
=txfq
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Thu Oct 19 17:41:18 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 17:41:25 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <c5f.48d0b91.3269752e@aol.com>

aaron said:
>    but it does so very very poorly.

and -- again -- if you point me to the books
where it does "very very poorly", maybe then
i'll do something to "improve" its performance.

so are you going to just throw in your cards,
or are you gonna show us what's in your hand?

because so far you haven't proven jack...

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 17:45:36 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 17:45:42 2006
Subject: [gutvol-d] leeward -- a first-draft open-source pass at
	"babelfish"
Message-ID: <c6d.218525e.32697630@aol.com>

david-

i changed the labels on those two buttons.   that's it.

if people are interested in this text-editor program,
i'll rewrite it from scratch, clean-room style, so its
lineage will be spot-free.   and you can be in charge
of making sure it "complies" with all of the legalities.

but since there are few (if any) realbasic programmers
here on this list, i'll assume that won't be necessary...

-bowerbird
From Bowerbird at aol.com  Thu Oct 19 17:48:57 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Oct 19 17:49:09 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c42.60d9f4a.326976f9@aol.com>

david said:
>    Of that, I have no doubt. ;)

at last we have an island of agreement!       :+)

of course, as it should be clear to everyone,
the issues won't be decided on this listserve.

they'll be decided by users out in the real world,
making decisions based on costs and benefits...

-bowerbird
From cannona at fireantproductions.com  Thu Oct 19 18:41:32 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Thu Oct 19 18:41:38 2006
Subject: [gutvol-d] what it all boils down to
References: <c5f.48d0b91.3269752e@aol.com>
Message-ID: <00ed01c6f3e8$e2f0c530$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Perhaps an analogy will help you understand:

Person a:
I just designed this great new shark bite suit that I think is just swell.

Person b:
Oh really?

Person a:
Yep.  The beauty of the design is that it's so simple, anyone can build it,
even an above average 4th grader.  Once people realize how utterly cool and
effective my design is, everyone will want one.

Person b:
Can I have one so I can take a look?

Person a:
Sorry, no.  But I will tell you how to build one. There are just thirteen
simple rules.

Person b, after reviewing the rules:
I can think of several situations where this suit won't be able to handle a
shark attack.

Person a:
That's not true!  My suit offers shark attack protection that is strong,
inclusive, and exhaustive.  I refuse to acknowledge any problems unless you
give me an example.

Person b:
Well, for one thing it's made of plastic wrap and rubber bands, so I think
the teeth of the shark are going to just cut right through it when he tries
to bite you.

Person a:
You're wrong.  My shark suit supports shark bites by advising that the user
swim really really fast.

Person b:
That's ridiculous.  Your supposed support for shark bites is no support at
all.  If it is, it's very very poor support.

Person a:
Show me where it has offered very very poor support for shark bites and
maybe then i'll do something to "improve" its performance.

so are you going to just throw in your cards, or are you gonna show us your
wounds?

Person b:
If you can't intuitively understand how plastic wrap and swimming faster are
not sufficient protection against shark teeth, then my showing you an
example of it not working isn't going to help you.


Get it?  If you can't intuitively understand that breaking tables into
smaller tables or displaying equations as graphics is really no support at
all, then my showing you an example of it not working isn't going to help
you.

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: <Bowerbird@aol.com>
To: <gutvol-d@lists.pglaf.org>; <Bowerbird@aol.com>
Sent: Thursday, October 19, 2006 7:41 PM
Subject: re: [gutvol-d] what it all boils down to


> aaron said:
>>    but it does so very very poorly.
>
> and -- again -- if you point me to the books
> where it does "very very poorly", maybe then
> i'll do something to "improve" its performance.
>
> so are you going to just throw in your cards,
> or are you gonna show us what's in your hand?
>
> because so far you haven't proven jack...
>
> -bowerbird
>


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOClTI7J99hVZuJcRAjaNAKC4GSyWe6WuYDswhh0M1HlA8o2JMwCfZc8z
h2mrB5qg9Z31vQElAFJuOq8=
=m4xx
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Fri Oct 20 00:54:17 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 00:54:26 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <cc3.1347d28.3269daa9@aol.com>

when i "call" you, 
you're supposed to
show me your cards,
not tell me a story
about a shark...

i think you are now
_firmly_ on the record
that z.m.l. won't work...

thanks for putting yourself
_firmly_ on the record, aaron.
time will tell...   yes, time will tell...

as it is, you exhausted my patience
-- which is a difficult thing to do! --
so i won't likely be responding to you
again any time soon, just so you know.

-bowerbird
From Bowerbird at aol.com  Fri Oct 20 02:03:35 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 02:03:46 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
Message-ID: <4bf.2aa73da.3269eae7@aol.com>

here's the perl version for babelfish assignment 01,
which was to read a file in and spit it to a webpage
with [pre], a simple task accomplished in 11 lines...

next time we'll learn about the "split" command,
and break out the text for each individual page...

and if someone would tell me how to do a "split" on
a sequence of multiple line-endings, that'd be great.

i assumed it would be something like this:
>    @thesections=split('\n\n\n\n\n',$thebook);

but that doesn't appear to be working for me.
so any help for this beginning perl script-kiddie
would be greatly appreciated, thank you much,
isn't open-source swell, all praise collaboration.

-bowerbird
----------------------------------------------------------------------

#!/usr/bin/perl
$filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
open (inf,"$filename") or print "that file was not available...<p>\n";
read (inf,$thebook,2222222); close inf;
print "content-type: text/html\n\n";
print '<!doctype html public "-//w3c//dtd html 4.01 transitional//en">';
print '<html lang="en"><head>';
print '<meta http-equiv="content-type" content="text/html; charset=us-ascii">';
print "<title>my antonia!";
print "</title></head><body><pre>";
print $thebook;
From schultzk at uni-trier.de  Fri Oct 20 02:15:51 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri Oct 20 02:15:57 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <C61A2D2B-AC55-48D3-9325-1E83C2FB4861@uni-trier.de>

Hi There,

	Due to its nature, multi-line parsing or splitting is not quite that easy
	in perl. If you are starting out with perl I would suggest getting the book
	"Perl for Dummies". A great introduction and plenty of examples. You will
	find your answer there!!

	split is nice. But you want to be doing parsing, which is an art in its own
	right. I am sure you are up to it.

	Just for the fun of it: your script is incomplete.

		regards
			Keith.

Am 20.10.2006 um 11:03 schrieb Bowerbird@aol.com:

> here's the perl version for babelfish assignment 01,
> which was to read a file in and spit it to a webpage
> with [pre], a simple task accomplished in 11 lines...
>
> next time we'll learn about the "split" command,
> and break out the text for each individual page...
>
> and if someone would tell me how to do a "split" on
> a sequence of multiple line-endings, that'd be great.
>
> i assumed it would be something like this:
> >   @thesections=split('\n\n\n\n\n',$thebook);
>
> but that doesn't appear to be working for me.
> so any help for this beginning perl script-kiddie
> would be greatly appreciated, thank you much,
> isn't open-source swell, all praise collaboration.
>
> -bowerbird
> ----------------------------------------------------------------------
>
> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...<p>\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print '<!doctype html public "-//w3c//dtd html 4.01 transitional//en">';
> print '<html lang="en"><head>';
> print '<meta http-equiv="content-type" content="text/html; charset=us-ascii">';
> print "<title>my antonia!";
> print "</title></head><body><pre>";
> print $thebook;

From marcello at perathoner.de  Fri Oct 20 03:10:16 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 03:10:20 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538A088.2080307@perathoner.de>

Bowerbird@aol.com wrote:

> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...<p>\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print '<!doctype html public "-//w3c//dtd html 4.01 transitional//en">';
> print '<html lang="en"><head>';
> print '<meta http-equiv="content-type" content="text/html; charset=us-ascii">';
> print "<title>my antonia!";
> print "</title></head><body><pre>";
> print $thebook;

BUAHAHAHAHAHA !

Ever considered upgrading your 4th grader ?

Or, at least, don't let him/her write your code.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From brett at dimetrodon.demon.co.uk  Fri Oct 20 03:44:32 2006
From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar)
Date: Fri Oct 20 03:46:43 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c16.76d0fb6.326963ed@aol.com>
References: <c16.76d0fb6.326963ed@aol.com>
Message-ID: <nFGuzmLQiKOFFw60@dimetrodon.demon.co.uk>

Bowerbird@aol.com writes
>brett said:
>>   Mathematical formulae
>
>graphics.   [heavy sigh]
>
A very, very stupid approach; xml has support for various methods of
doing the job properly.

>>   exotic typography
>graphics.

A very, very stupid approach; xml has support for various methods of
doing the job properly, and additional support can be added.

>
>>   support for features impossible in print.
>such as?

Interactive tables, with sorting, functioning calculations, currency 
conversions, dynamically altering perspective on 3D graphs, clicking on 
a diagram to take you to the section on that component &c.

>
>>   All of which can be done in a basic xml just as easily.
>
>"just as easily" is a decision that end-users will make.
>
>
>>   Future proofing
>
>yeah, right.   we've had, what, 4 versions of .html
>in the last decade, and even that is now outdated?

Each version has been a superset of the previous one; stuff has been
added over time, as the fundamental design of the format makes this
fairly easy.

>x.m.l. advocates seem to have a _very_ short memory.
>that's not what i see as being good for "future proofing".
>

That is what makes the format future proof. If a feature turns out to be 
needed then it can be added later. As has happened several times with 
html, people came up with ideas the earlier designers hadn't thought of.

You seem to have the arrogance to believe that you can think of every
needed feature at the outset. xml is designed to allow the
straightforward addition of new features that never occurred to the
original designers.

>>   some of it relies on counting carriage returns,
>>   which is hard to do by eye.
>
>that's why you have the computer do it for you,
>and show the results in a starkly obvious way...
>
>

If you aren't doing it by hand what possible advantage does using an 
editor producing zml have over one producing basic xml?

Why is four carriage returns easier than <h1> </h1>?

>>   I very much doubt zml could handle that,
>>   nonetheless it is required for that book.
>
>let me know when that book hits the p.g. library,
>because i'm quite sure i'll be able to handle it then.
>

I want to know how you plan to handle having a single letter in a word
in a different font from the rest of the word. xml can; I am certain
that zml cannot.

>or alternately, show me the x.m.l. "solution" for it.

Something like
"<font-style=arabic>watc</font-style=arabic>h<font-style=arabic>man</font-style=arabic>".

>
>>   That isn't the point, the point is
>>   if a feature turns out to be needed at some point in the future,
>>   and has been omitted from the original spec,
>>   xml can be extended to include it in a straightforward manner
>>   zml can't.
>
>of course z.m.l. can add new features when needed.
>
>
In zml there is no way of indicating that the following is mark-up and
then allowing an arbitrary string. In xml anything in angle brackets is
markup; this makes adding new types of mark-up simple.
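
As a rough illustration of that point, the following Perl sketch separates text from markup without knowing any tag names in advance. It is a toy under stated assumptions, not an XML parser: the function name tokenize and the tag name x-new-tag are made up, and comments, CDATA sections and attribute values containing ">" are ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a string into text and markup tokens. Anything of the form <...>
# is treated as markup, whatever the tag name happens to be.
# Returns a list of [kind, content] pairs.
sub tokenize {
    my ($s) = @_;
    my @tokens;
    # The capturing group makes split return the separators (the tags) too.
    for my $piece (split /(<[^<>]+>)/, $s) {
        next if $piece eq '';
        my $kind = ($piece =~ /^<[^<>]+>$/) ? 'markup' : 'text';
        push @tokens, [$kind, $piece];
    }
    return @tokens;
}

# A tag this code has never heard of is still recognised as markup.
for my $t (tokenize('wat<x-new-tag>c</x-new-tag>hman')) {
    print "$t->[0]: $t->[1]\n";
}
```

Because the delimiter is generic, a reader that does not understand a new tag can at least skip it cleanly instead of mistaking it for text.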

>>   header being accidentally shown as a chapter header automatically,
>>   this makes it easier to find and fix errors.
>
>i build that capability into the authoring-tool, where it's most
>useful.
>
>
>>   If you are using a special application to edit it anyway
>>   there is no reason to obscure the mark-up in the source code.
>
>except because obfuscatory mark-up is ugly, that's all.
>not to mention obfuscatory.   and that it gets in the way.

If properly written it is not obfuscatory. The mark-up in simple cases is
obvious, as it is simply contained within angle brackets, which clearly
distinguishes the mark-up from the text. The advantage of xml is that it
can also do the complicated stuff that zml cannot; that looks complicated
because it is complicated.

>>   With xml you can also edit in something like notepad rather more easily.
>
>here again you say something that's ridiculous on its face.
>
>editing x.m.l. in notepad.   yeah, right.   that's the solution!
>i'm perplexed why lee and jon noring never thought of that!

If editing simple texts, i.e. the kind that zml can handle, there aren't
actually a lot of tags to keep track of: italics, bold, underline, a few
layers of headers, pictures and footnotes, all of which can be dealt with
by very basic xml.

>>   Tom's ebookreader <http://pws.prserv.net/Fellner/Software/index.htm>
>>   has long done something like that.
>
>very few things in z.m.l. are unprecedented.   that's the point of it.
>
>
>>   I don't really see this as having any detectable advantage over the xml
>
>then i suggest you stick with x.m.l., brett.   believe me, that's o.k.
>with me!
>
>
>>   Blank lines are easy to understand, they are hard to edit,
>>   which is the problem.
>
>i don't know about _your_ keyboard, brett, but mine has this
>big key that is clearly labeled "return" in a prominent place...
>and there's another one -- labeled "enter" -- in the corner...
>

The problem is that it is hard to see the mark-up and harder to find the
errors, as they rarely break validation. Four CRs rather than three CRs
would still validate, while <h33> </h3> would be an error that a
validator would catch.
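
A toy version of that check, in Perl, shows why the mistyped tag is detectable. The sketch only tests that opening and closing tags pair up by name; a real validator would also check the document against a DTD or schema, where h33 would fail as an unknown element as well. The function name check_tags is made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy well-formedness check: every opening tag must be closed by a tag
# of the same name, in the right order. Returns a list of error strings
# (an empty list means the tags balance).
sub check_tags {
    my ($s) = @_;
    my (@stack, @errors);
    while ($s =~ m{<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>}g) {
        my ($closing, $name, $self_closing) = ($1, $2, $3);
        if ($closing) {
            my $open = pop @stack;
            if (!defined $open) {
                push @errors, "</$name> has no opening tag";
            } elsif ($open ne $name) {
                push @errors, "<$open> closed by </$name>";
            }
        } elsif (!$self_closing) {
            push @stack, $name;     # remember the tag until it is closed
        }
    }
    push @errors, "<$_> is never closed" for @stack;
    return @errors;
}

for my $src ('<h3>Chapter I</h3>', '<h33>Chapter I</h3>') {
    my @errors = check_tags($src);
    print "$src => ", (@errors ? "ERROR: @errors" : "ok"), "\n";
}
```

The blank-line equivalent has no such pairing to check: every count of empty lines is legal, so a slip can only be noticed by looking at the rendered result.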
-- 
Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm
Livejournal http://brett-dunbar.livejournal.com/
Brett Paul Dunbar
To email me, use reply-to address
From marcello at perathoner.de  Fri Oct 20 03:54:46 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 03:54:51 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538AAF6.9040809@perathoner.de>

Bowerbird@aol.com wrote:

> #!/usr/bin/perl
> $filename="/home2/yoursiteinfo/public_html/myant/myant.zml";
> open (inf,"$filename") or print "that file was not available...<p>\n";
> read (inf,$thebook,2222222); close inf;
> print "content-type: text/html\n\n";
> print '<!doctype html public "-//w3c//dtd html 4.01 transitional//en">';
> print '<html lang="en"><head>';
> print '<meta http-equiv="content-type" content="text/html; charset=us-ascii">';
> print "<title>my antonia!";
> print "</title></head><body><pre>";
> print $thebook;

Show this to your 4th grader, so he/she won't stay a 4th grader forever.

Oh! And this actually takes the file name from the command line instead
of hardcoding it into the program. How to extract the title out of the
file is left as an exercise for the 4th grader.


#!/usr/bin/perl

# slurp whole file, mem is cheap
undef $/;
$text = <>;

print <<HERE;
<!DOCTYPE html public "-//w3c//dtd html 4.01 transitional//en">
<html lang="en">
  <head>
    <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
    <title>my antonia!</title>
  </head>
  <body>
    <pre>$text</pre>
  </body>
</html>
HERE
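
One caveat with dropping raw text into a <pre> element: any "<", ">" or "&" in the book will be read by the browser as markup. A minimal sketch of escaping the text first follows; the helper name escape_html is invented here, and a module such as HTML::Entities could be used instead if it is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the three characters that matter inside element content.
# "&" must be replaced first, or the other entities get double-escaped.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

print escape_html('if a < b && b > c'), "\n";
# prints: if a &lt; b &amp;&amp; b &gt; c
```

In the script above this would mean interpolating the escaped copy of the text into the here-document rather than the raw one.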


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 04:22:51 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 04:22:55 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4bf.2aa73da.3269eae7@aol.com>
References: <4bf.2aa73da.3269eae7@aol.com>
Message-ID: <4538B18B.3010000@perathoner.de>

Bowerbird@aol.com wrote:

> and if someone would tell me how to do a "split" on
> a sequence of multiple line-endings, that'd be great.

Why don't you treat yourself to "Perl for Trolls" for halloween?


$ man perlre

       ...

       Matching operations can have various modifiers.
       Modifiers that relate to the interpretation of the
       regular expression inside are listed below.

       ...

       s   Treat string as single line.  That is, change
           "." to match any character whatsoever, even a
           newline, which normally it would not match.

       ...


#!/usr/bin/perl

undef $/;           # slurp whole file, mem is cheap
$text = <>;
$text =~ s/\r\n?/\n/g;   # fix brain-dead M$-DOS and Mac line endings

@chapters = split (/\n{5}/s, $text);   # 5 or whatever

for (@chapters) {
    # do something with chapters, like printing
    print "CHAPTER:\n\n$_\n\n";
}
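
A slightly more defensive variant of the same idea, as a sketch. It normalises CRLF and lone-CR line endings to LF before splitting, accepts five or more newlines in a row, and drops sections that contain only whitespace. The threshold of five comes from the "5 or whatever" above; the function name split_sections is made up. As far as the documentation for split goes, a quoted string given as its first argument is still treated as a pattern, so carriage returns in the file are one plausible (unconfirmed) reason the original one-liner did not match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a book into sections separated by four or more empty lines,
# i.e. five or more consecutive newlines.
sub split_sections {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;                        # CRLF or lone CR becomes LF
    return grep { /\S/ } split /\n{5,}/, $text;   # skip whitespace-only pieces
}

# The same text with DOS line endings still splits into two sections.
my @sections = split_sections("one\r\n\r\n\r\n\r\n\r\ntwo\r\n");
print scalar(@sections), " sections\n";
```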


-- 
Marcello Perathoner
webmaster@gutenberg.org

From bill at williamtozier.com  Fri Oct 20 05:06:28 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 05:06:40 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <1161300988.6501.26.camel@localhost.localdomain>
References: <c66.2c6c517.32693c7f@aol.com>
	<1161300988.6501.26.camel@localhost.localdomain>
Message-ID: <8B510152-ED4D-4EA0-9B77-1CAB179464A9@williamtozier.com>


On Oct 19, 2006, at 7:36 PM, David A. Desrosiers wrote:

> On Thu, 2006-10-19 at 16:39 -0400, Bowerbird@aol.com wrote:
>> i'm confident the hundreds of lurkers here
>> can tell -- without any difficulties at all --
>> who is being rational, solid, and thoughtful,
>> and who is not.
>
> Of that, I have no doubt. ;)

As somebody committed to trying to promote and expand the  
collaborative effort underway at DP and PG, I disagree. Traffic on  
this list, which ideally could be a useful resource for production  
and improvement of the community and workflow, has degenerated to  
bickering over trivialities.

The usefulness of fully open "free" social systems is easily
undermined, and this is a prime example. Anybody happening across
this whole fiasco, or as far as I can see *any* contribution by  
bowerbird, would be quick to dismiss the entire community's effort as  
an amateurish and pointless waste of time as a result.

Will you all please stop it and go away, if only to allow what little  
real discussion is required to proceed? Any subscriber who you might  
*want* to be your audience has you in a killfile, and everybody else  
reading this in archives or as a newcomer is being exposed to your  
infantile quibbling as a first taste of what PG workers are like.

Nobody cares now, nor will they in the future, about this fiddling  
nonsense. If it's so damned important, go *do* it and stop asking for  
pats on the head or acknowledgment of your genius. Go settle this  
elsewhere, please.

Is nobody administering this list? To what end, exactly? If it's  
merely been set up as flypaper for bowerbird, then please put that in  
the headers so we can at least let it simmer here out of public view.

bowerbird and respondents: Just go write in one another's blog  
comments, please.

Please!
-----
Bill Tozier
AIM:    vaguery@mac.com
blog:   http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype:  vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
   --Hjalmar Hjorth Boyesen


From prosfilaes at gmail.com  Fri Oct 20 05:18:18 2006
From: prosfilaes at gmail.com (David Starner)
Date: Fri Oct 20 05:18:22 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <c42.60d9f4a.326976f9@aol.com>
References: <c42.60d9f4a.326976f9@aol.com>
Message-ID: <6d99d1fd0610200518x41627ac0r19a9ca01f54a8ec0@mail.gmail.com>

On 10/19/06, Bowerbird@aol.com <Bowerbird@aol.com> wrote:
>  they'll be decided by users out in the real world,
>  making decisions based on costs and benefits...

Out in the real world, many groups that transcribe texts, like Oxford,
use TEI. How many use ZML?
From cannona at fireantproductions.com  Fri Oct 20 07:57:38 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Fri Oct 20 07:58:14 2006
Subject: [gutvol-d] what it all boils down to
References: <cc3.1347d28.3269daa9@aol.com>
Message-ID: <003b01c6f458$2c62f520$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I wrote:
> Get it?

Alas, it appears he did not.

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: <Bowerbird@aol.com>
To: <gutvol-d@lists.pglaf.org>; <Bowerbird@aol.com>
Sent: Friday, October 20, 2006 2:54 AM
Subject: re: [gutvol-d] what it all boils down to


> when i "call" you,
> you're supposed to
> show me your cards,
> not tell me a story
> about a shark...
>
> i think you are now
> _firmly_ on the record
> that z.m.l. won't work...
>
> thanks for putting yourself
> _firmly_ on the record, aaron.
> time will tell...   yes, time will tell...
>
> as it is, you exhausted my patience
> -- which is a difficult thing to do! --
> so i won't likely be responding to you
> again any time soon, just so you know.
>
> -bowerbird
>


- --------------------------------------------------------------------------------


> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOOQII7J99hVZuJcRArKjAKC1Q7ssBPqDJEIPkYlplXVjwjMqkgCg1yYG
kzF/f5pAb0YUrjxahvjGGEM=
=QbBH
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Fri Oct 20 09:00:00 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:00:11 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <cd1.b00947.326a4c80@aol.com>

bill, the only time there is "needless bickering" on this list
is when my detractors instigate it.   as you might or might not
be aware, they'd been quiet for quite a while, and it was peaceful
here in the land of gutvol-d.   it was only this week they came out
and raised a big fuss again...

-bowerbird
From Bowerbird at aol.com  Fri Oct 20 09:14:13 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:14:28 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <cce.b0672a.326a4fd5@aol.com>

david said:
>    Out in the real world, many groups that transcribe texts, 
>    like Oxford, use TEI. How many use ZML?

out in the real real world, most people use .pdf as their format.
sad but true.

it's true that some entities -- i would use university of virginia
as my example, but oxford is fine too -- use .tei, that is true...

(of course, you might want to notice that none of those entities
has one-tenth the traction that project gutenberg has.   why not?)

anyway, some people use docbook, some roll their own format.

there's certainly no abundance of experts in any of these
esoteric formats, certainly none that have shown up _here_
to volunteer their expertise and ease the markup burden,
which is why the .tei effort here is moving along so glacially.

meanwhile, a new move to no-markup authoring is proceeding
at an ever-increasing speed out in the "real real world" where
"user-generated content" is among the buzzwords of the day.
because there'd be no faster way to kill the buzz than to require
plain ordinary people to deal with heavy markup to contribute...

markdown has a big following, with a crew of developers in tow,
and served as the model for the "crossmark" function that o.l.p.c.
is developing.   wiki-formatting is huge, even if we look no farther
than wikipedia itself.   and with the advent of wysiwyg authoring
right in the webpage, with innovations like "writely" and the rest,
the days of heavy markup that writers need to deal with are going
to come to an end -- a very sudden end -- and that will be soon.

so if you're counting on "installed base" as your best argument
for heavy markup, you're putting yourself on very shaky ground.

-bowerbird
From Bowerbird at aol.com  Fri Oct 20 09:19:11 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:19:23 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
Message-ID: <380.f7cf57f.326a50ff@aol.com>

marcello said:
>    Oh! And this actually takes the file name from the command line instead
>    of hardcoding it into the program. How to extract the title out of the
>    file is left as an exercise for the 4th grader.
>
>    #!/usr/bin/perl
>    # slurp whole file, mem is cheap
>    undef $/;
>    $text = <>;
>    print <<HERE;
>    <!DOCTYPE html public "-//w3c//dtd html 4.01 transitional//en">
>      <head>
>        <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
>        <title>my antonia!</title>
>      </head>
>
>      $text

believe it or not, we're getting something constructive out of marcello!
that's amazing!   open-source really _is_ transformative, isn't it!   :+)

ok, sometimes -- when you make a script available for people to use --
the ability to get the filename from the command-line is good practice.
so thank you, marcello, for this excellent example of how to do that...
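
A minimal sketch of how the quoted script might be completed, assuming the goal is simply to wrap one plain-text file in an HTML page. The helper names slurp_file and wrap_html, the <pre> wrapper, and the use of the file name as the page title are assumptions made for illustration; none of them come from marcello's post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read a whole file into one string (the "slurp" idiom from the quoted script).
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    local $/;                 # no record separator inside this sub only
    my $text = <$fh>;
    close($fh);
    return $text;
}

# Wrap plain text in a small HTML 4.01 page.
# wrap_html is a hypothetical helper, not part of the original script.
sub wrap_html {
    my ($title, $text) = @_;
    # Escape the characters that are special in HTML.
    for ($title, $text) {
        s/&/&amp;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
    }
    return <<"HERE";
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
    <title>$title</title>
  </head>
  <body>
    <pre>$text</pre>
  </body>
</html>
HERE
}

# Usage: perl wrap.pl book.txt > book.html
if (@ARGV) {
    my $path = shift @ARGV;
    print wrap_html($path, slurp_file($path));
}
```

Using `local $/` inside the sub, rather than `undef $/` at file level, keeps line-by-line reading intact for any other code in the same program.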

other times, when the script is sitting on your website and to be called,
you'll want it to read the parameters as passed from the calling script.
so, for instance, if we were to pass the filename starting in column 12:
>    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
>    $thefilename=substr($buffer,12);
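
The two lines above drop the first 12 characters of the POST body (substr offsets are zero-based). That would fit a body of the form `thefilename=...`, since that field name plus the "=" is 12 characters long, though the post does not say so. A hedged sketch that looks the field up by name instead of by offset follows. It assumes a form-encoded body and a field called thefilename; parse_form and safe_filename are hypothetical helpers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode an application/x-www-form-urlencoded body into a hash.
# parse_form is a hypothetical helper, not part of the original post.
sub parse_form {
    my ($body) = @_;
    my %field;
    for my $pair (split /[&;]/, $body) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                              # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX encodes one byte
        }
        $field{$name} = $value;
    }
    return %field;
}

# Accept only a bare file name: no directory parts, no leading dot.
sub safe_filename {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /\A[A-Za-z0-9_][A-Za-z0-9_.-]*\z/ ? $name : undef;
}

# Under a web server the POST body arrives on STDIN.
if (defined $ENV{CONTENT_LENGTH} && $ENV{CONTENT_LENGTH} =~ /\A\d+\z/) {
    my $buffer = '';
    read(STDIN, $buffer, $ENV{CONTENT_LENGTH});
    my %field = parse_form($buffer);
    my $thefilename = safe_filename($field{thefilename});
    print "Content-type: text/plain\n\n";
    print defined $thefilename ? "file: $thefilename\n" : "bad file name\n";
}
```

Checking the name before opening anything matters here because the value comes straight from the caller; a fixed offset would pass along whatever followed it, including path separators.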

we're getting a little ahead of ourselves right now, it's best to take this
one step at a time, but since marcello did good, we wanna reward him.

a philosophical mantra of perl is "there is more than one way to do it",
so don't hesitate to provide your input on any of these code examples.
that's what the whole "many eyeballs" thing is all about, folks!

-bowerbird

p.s.   marcello also weighed in with a good reply on the "split" i asked for,
and we'll get to that in coming days, i promise.   the message there was
"sometimes the bug ain't where you think it is, so keep your mind open."
but more on that later...
From Bowerbird at aol.com  Fri Oct 20 09:19:23 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:19:40 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c53.5ce4f7b.326a510b@aol.com>


brett, you've exhausted my patience as well.   adieu.

-bowerbird
From sam.bretheim at gmail.com  Fri Oct 20 09:24:56 2006
From: sam.bretheim at gmail.com (Sam Bretheim)
Date: Fri Oct 20 09:33:13 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
 format and process debate
Message-ID: <4538F858.8020809@gmail.com>

I propose that we create a new mailing list, perhaps called something 
like gutvol-alternatives-d, gutvol-debate-d, or gutvol-formats-d, in 
order to quarantine debate on "revolutionary" digitization approaches 
other than the standard PG and PGDP production processes.  This proposal 
is not an attempt to crush dissent and innovation; rather, it is 
intended to decrease the signal-to-noise ratio on this forum, for the 
benefit of the many volunteers who have no interest in the matter.

From desrod at gnu-designs.com  Fri Oct 20 09:42:21 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Fri Oct 20 09:43:34 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for
	PG-related  format and process debate
In-Reply-To: <4538F858.8020809@gmail.com>
References: <4538F858.8020809@gmail.com>
Message-ID: <1161362541.6048.0.camel@localhost.localdomain>

On Fri, 2006-10-20 at 10:24 -0600, Sam Bretheim wrote:
> This proposal is not an attempt to crush dissent and innovation;
> rather, it is intended to decrease the signal-to-noise ratio on this
> forum, for the benefit of the many volunteers who have no interest in
> the matter. 

I concur, great idea.

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From desrod at gnu-designs.com  Fri Oct 20 09:47:46 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Fri Oct 20 09:48:33 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <cce.b0672a.326a4fd5@aol.com>
References: <cce.b0672a.326a4fd5@aol.com>
Message-ID: <1161362866.6048.3.camel@localhost.localdomain>

On Fri, 2006-10-20 at 12:14 -0400, Bowerbird@aol.com wrote:
> >   Out in the real world, many groups that transcribe texts, 
> >   like Oxford, use TEI. How many use ZML?

> out in the real real world, most people use .pdf as their format.
> sad but true. 

People don't author PDF files, they SaveAs/Export/Print to PDF format
files, but their source material is in some other format... HTML, XML,
Microsoft Word, OpenOffice.org and so on. I think you're confusing the
two... 

There is no reason that I can see, why ZML, XML, TEI, TeX, foo, bar and
blort formats can't all support the same final output, since the end
users will never have to interact with the original source material that
was used to produce them. 

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From Bowerbird at aol.com  Fri Oct 20 09:50:16 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 09:50:35 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
Message-ID: <c56.49eae21.326a5848@aol.com>

keith said:
>   Due to its nature, multi-line parsing or splitting is not quite that easy

maybe.   but we'll make it work.          :+)


>    split is nice. But you want to be doing parsing

on a personal note, any time i call what i'm doing "parsing",
it goes badly.   but as soon as i call it by _another_ name,
the same thing with the same code, it starts working better.
so i've grown allergic to that word, and i almost never use it.       :+)

however, i assume that you're talking about "parsing" in the
"let's parse the dom tree" sense.   (no, i don't even know what 
that means, so i might well have misused it, which would be
poetic in its own way.)

that kind of "parsing" would make our code very complicated.

and in the same way that i don't like my format to be complex,
i don't like my programs to be complex.   so i make them simple.

and what i'm showing people here is how much mileage can be
obtained out of the simple combination of a simple format and
some simple programs.   that's the whole purpose of this exercise.

so just stick with me for a little bit on "split", and see some tricks.

(and, just to be clear, although you might think this is related to
z.m.l., and thus can be swiftly relegated to the "i don't care" pile,
the truth of the matter is that since virtually all of the books in
the p.g. library have a plain-ascii representative, one that is close
to z.m.l. format and perhaps even exact, the code that i'm showing
here could also be used to great effect on the library as it stands.
there are a lot of neat features that could be offered with very little
work or trouble by using the simple code routines i'll reveal here.
just as an example, how about a simple script that would give us
a list of the section-headers for every book in the entire library?
i don't know about you, but i think this "super table of contents"
for the entire library would be very cool, and likely quite useful.
and within a week or two of these little daily lessons, we'll have it.)
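
One way the "super table of contents" idea could be sketched with nothing but split. The rule used here, that a section header is the first line after four or more blank lines, is an assumption chosen for illustration; the thread does not define the rule, and real texts in the library vary. The name section_headers is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first line of every chunk that follows four or more blank lines.
# ASSUMPTION: that spacing marks a section header; adjust {4,} for other texts.
sub section_headers {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;                        # normalize line endings
    my @chunks = split /\n(?:[ \t]*\n){4,}/, $text;
    shift @chunks;                                # text before the first gap is front matter
    my @headers;
    for my $chunk (@chunks) {
        # First non-blank line of the chunk, trimmed of surrounding spaces.
        my ($first) = $chunk =~ /^[ \t]*(\S.*?)[ \t]*$/m;
        push @headers, $first if defined $first;
    }
    return @headers;
}

# Usage: perl headers.pl book1.txt book2.txt ...
for my $path (@ARGV) {
    my $fh;
    unless (open($fh, '<', $path)) {
        warn "skipping $path: $!\n";
        next;
    }
    local $/;                                     # slurp the whole file
    my $text = <$fh>;
    close($fh);
    print "$path\t$_\n" for section_headers($text);
}
```

Run over a directory of plain-text files, this prints one tab-separated "file, header" line per detected section, which could then be sorted or loaded elsewhere.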


>    Just for the fun of it your script is incomplete.

good observation, keith.   now tell us why, and complete it...       ;+)

-bowerbird
From joshua at hutchinson.net  Fri Oct 20 09:51:43 2006
From: joshua at hutchinson.net (joshua@hutchinson.net)
Date: Fri Oct 20 09:52:02 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for
	PG-related  format and process debate
Message-ID: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net>

That'd be great except for one thing.  We tried creating a separate 
forum for this stuff before and the birdie boy ignored it and kept on 
posting his flamebait to gutvol-d.  The only thing that ever works 
(outside outright banning him, which we've also done in the past) is 
for everyone to ignore him until he gets bored (but he always comes 
back again later).

The problem is this: bowerbird is probably one of the most proficient 
and accomplished trolls I've run across in 15+ years on the Internet.  
He's definitely the only one I've ever promised to punch in the mouth 
if we ever meet in person!  He will eventually rant idiocy about 
something someone truly cares about and then the flames will start all 
over again.

Josh

>----Original Message----
>From: desrod@gnu-designs.com
>Date: Oct 20, 2006 12:42 
>To: "Project Gutenberg Volunteer Discussion"<gutvol-d@lists.pglaf.org>
>Subj: Re: [gutvol-d] Proposal: creation of new mailing list for PG-related format and process debate
>
>On Fri, 2006-10-20 at 10:24 -0600, Sam Bretheim wrote:
>> This proposal is not an attempt to crush dissent and innovation;
>> rather, it is intended to decrease the signal-to-noise ratio on this
>> forum, for the benefit of the many volunteers who have no interest in
>> the matter.
>
>I concur, great idea.
>
>-- 
>David A. Desrosiers
>desrod@gnu-designs.com
>http://gnu-designs.com
>
>"Erosion of civil liberties... is a threat to national security."


From bill at williamtozier.com  Fri Oct 20 09:53:53 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 09:54:05 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <cd1.b00947.326a4c80@aol.com>
References: <cd1.b00947.326a4c80@aol.com>
Message-ID: <596C2A52-74DC-4F7C-B898-816FF9D27AA5@williamtozier.com>


On Oct 20, 2006, at 12:00 PM, Bowerbird@aol.com wrote:

> bill, the only time there is "needless bickering" on this list
> is when my detractors instigate it.  as you might or might not
> be aware, they'd been quiet for quite a while, and it was peaceful
> here in the land of gutvol-d.  it was only this week they came out
> and raised a big fuss again...

Then it is the better part of valor to ignore them. Clearly, if the  
only traffic is people picking on you, and the best course to  
undermine trolls' behavior is to ignore them, then you are best  
served by ignoring your detractors.

Giving in to the impulse to correct their misstatements immediately  
is exactly what they're looking for. Any reasonable person will be  
able to read the archives and see that they are simply throwing  
sticks and stones, and that (if you cease posting in response) you're  
the only one providing actual useful content.

Can't lose by staying quiet. Give it a shot.
-----
Bill Tozier
AIM:    vaguery@mac.com
blog:   http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype:  vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
   --Hjalmar Hjorth Boyesen


From bill at williamtozier.com  Fri Oct 20 09:54:21 2006
From: bill at williamtozier.com (William Tozier)
Date: Fri Oct 20 09:54:26 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <4538F858.8020809@gmail.com>
References: <4538F858.8020809@gmail.com>
Message-ID: <70F41ED1-6EC2-4A83-AA1A-F5D416BC4C7A@williamtozier.com>


On Oct 20, 2006, at 12:24 PM, Sam Bretheim wrote:

> I propose that we create a new mailing list, perhaps called  
> something like gutvol-alternatives-d, gutvol-debate-d, or gutvol- 
> formats-d, in order to quarantine debate on "revolutionary"  
> digitization approaches other than the standard PG and PGDP  
> production processes.  This proposal is not an attempt to crush  
> dissent and innovation; rather, it is intended to decrease the  
> signal-to-noise ratio on this forum, for the benefit of the many  
> volunteers who have no interest in the matter.

God yes.
-----
Bill Tozier
AIM:    vaguery@mac.com
blog:   http://williamtozier.com/slurry
plazes: http://beta.plazes.com/user/BillTozier
skype:  vaguery

"Nature, however picturesque, never yet made a poet of a dullard."
   --Hjalmar Hjorth Boyesen


From Bowerbird at aol.com  Fri Oct 20 10:14:31 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 10:14:40 2006
Subject: [gutvol-d] re: to punch in the mouth
Message-ID: <c13.79d1f51.326a5df7@aol.com>

josh said:
>    He's definitely the only one I've ever promised 
>    to punch in the mouth if we ever meet in person!

are you serious?

i mean, really?

because that's _funny_.

anyway, i can't promise that i won't punch you right back,
but violence is _so_ 20th-century...

much better to just have you thrown in jail, i guess.

-bowerbird
From marcello at perathoner.de  Fri Oct 20 10:24:25 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:24:29 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <cce.b0672a.326a4fd5@aol.com>
References: <cce.b0672a.326a4fd5@aol.com>
Message-ID: <45390649.7080609@perathoner.de>

Bowerbird@aol.com wrote:

> wiki-formatting is huge, even if we look no farther
> than wikipedia itself.

Then why do you invent a new format that is inferior to wiki?

Everybody already knows wiki, so go with that.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 10:26:15 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:26:18 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <380.f7cf57f.326a50ff@aol.com>
References: <380.f7cf57f.326a50ff@aol.com>
Message-ID: <453906B7.6070700@perathoner.de>

Bowerbird@aol.com wrote:

> believe it or not, we're getting something constructive out of marcello!

Now we just need to figure out how to get the same out of you!


-- 
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de  Fri Oct 20 10:34:03 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:34:10 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <380.f7cf57f.326a50ff@aol.com>
References: <380.f7cf57f.326a50ff@aol.com>
Message-ID: <4539088B.2070009@perathoner.de>

Bowerbird@aol.com wrote:

> other times, when the script is sitting on your website and to be called,
> you'll want it to read the parameters as passed from the calling script.
> so, for instance, if we were to pass the filename starting in column 12:
>>    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
>>    $thefilename=substr($buffer,12);

You are just cutting and pasting out of some perl cgi tutorial. You
don't have the least idea what is going on.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Fri Oct 20 10:35:05 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 10:35:25 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <cb0.13f46a1.326a62c9@aol.com>

william said:
>    Can't lose by staying quiet. Give it a shot.

if by "staying quiet", you mean "stop responding to trolls",
well, i've done just that, today, with both aaron and brett...

and josh -- with a _repeat_ of a promise to "punch" me --
has probably earned himself that distinction now as well...

and after i complimented a post of his, and replied to it
_in_detail_ (which got absolutely no response from him),
sam is now suggesting that i be exiled off to another list.
um, i guess the message is that if you can't answer a critic,
you should have him silenced.   that seems to be in vogue
with our president these days, but here on gutvol-d too?

so yeah, in some ways it looks pretty bleak here.

on the other hand, marcello, who has had a signal-to-noise
ratio of about 3/997 in the past, came through with _two_ (2)
constructive posts today.   (oh sure, they had trollish language,
but nonetheless they were _constructive_ in the sense that they
were on-point and added a relevant point.   what more can i ask?)

and in spite of his occasional resort into ad hominem land,
david consistently drags in some good arguments as well...

and keith is kicking in some good stuff too.

so i don't think gutvol-d is the wasteland you've described.

would it be better if people were civil to me?   undoubtedly.
because then i could continue to be my normal civil self...

and would it be better if people who can't be civil to me would
simply not respond to me?   absolutely.   i loved the peace and
quiet that has been the normal mode around here for months.

in fact, i _encourage_ those people who cannot resist responding
to me to put me in their kill-files and not even _read_ my posts;
i'm not here for the conflict.   frankly, i think conflict is _stupid_...

***

on the other hand, if you mean that i should stop speaking,
simply because some people occasionally jump all over me,
let me just say that that's not likely to happen, bill.   not at all.

of course, if the president happens to pull me off the street
as "a suspected terrorist", then you'll stop hearing from me.
(i can see josh reaching for the phone now to call the c.i.a.)

but barring that, i'll post here regularly, with relevant thoughts.

-bowerbird
From marcello at perathoner.de  Fri Oct 20 10:36:08 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 10:36:12 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <4538F858.8020809@gmail.com>
References: <4538F858.8020809@gmail.com>
Message-ID: <45390908.4020705@perathoner.de>

Sam Bretheim wrote:

> I propose that we create a new mailing list, perhaps called something
> like gutvol-alternatives-d, gutvol-debate-d, or gutvol-formats-d, in
> order to quarantine debate on "revolutionary" digitization approaches
> other than the standard PG and PGDP production processes.  This proposal
> is not an attempt to crush dissent and innovation; rather, it is
> intended to decrease the signal-to-noise ratio on this forum, for the
> benefit of the many volunteers who have no interest in the matter.

We already have that: gutvol-p

Bowerbird got moderated on gutvol-p so he took over this one.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Fri Oct 20 11:00:32 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 11:00:42 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <c51.5a85e1f.326a68c0@aol.com>

david said:
>    People don't author PDF files, they SaveAs/Export/Print to PDF format files,
>    but their source material is in some other format... HTML, XML,
>    Microsoft Word, OpenOffice.org and so on. 

the point is, the thing they distribute to their readers is a .pdf.
(and hey, david, i don't like that fact any better than you do.)


>    I think you're confusing the two...

no, i'm clear.   when most entities out there in the real world
make a file available to other entities, they do it using a .pdf.

your distinction between "authoring" and "save/export/print to pdf"
is an arbitrary one.   we could make the same point about .html,
with some people composing it in dreamweaver, others in notepad,
others in microsoft word or openoffice, and so on.   but the point is
that they're using .html as the vehicle.   (and in most cases, they will
put .html material on the web itself, rather than distribute it as files.
given _that_ view of things, then .html is a bigger vehicle than .pdf.)


>    There is no reason that I can see, 
>    why ZML, XML, TEI, TeX, foo, bar and blort formats 
>    can't all support the same final output

wow.   did you really mean to include .zml in that list?
i mean, i agree with the statement as you wrote it, but
it would be a major concession for you to say that .zml
can do everything .xml can do...   or do i misunderstand?


>    since the end users will never have to interact with 
>    the original source material that was used to produce them.

that is one view of things, that the "master version" is one that is
never shared with users, that all they receive is derivative versions.

personally, i'd rather _empower_ users by giving them the "master".

and i'd like to further empower users by giving them conversion tools,
so they could generate all the derivative versions themselves, without
any need for ever having to consult with me at any time down the line.

(people seem to _expect_ that the web "will always be there", but i know
that an evil president could shut the thing down in the blink of an eye,
and i am _not_ so naive as to believe we'll never have an evil president.
due to this, i want to distribute the books in our library far and wide.)

even _more_, i'd like to empower users by giving them authoring and
viewing tools that work directly on the "master version", so they would
have no need to have to ever bother with generating derivative formats.
(part and parcel of this is giving them a format that's easy to understand.)
this independence is (for me) the only long-range plan worth supporting.

i'm guessing that if i _can_ empower users in these ways, that they will
come to me, instead of going to someone that wants to be their master
by hoarding the "master version".   you might see it differently.   viva la...

-bowerbird
From lee at novomail.net  Fri Oct 20 11:04:19 2006
From: lee at novomail.net (Lee Passey)
Date: Fri Oct 20 11:02:49 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <A2D00D2F-F826-48C2-A1CF-37BE89A99CB2@uni-trier.de>
References: <c10.7616f00.326884df@aol.com>
	<A2D00D2F-F826-48C2-A1CF-37BE89A99CB2@uni-trier.de>
Message-ID: <45390FA3.4050502@novomail.net>

Schultz Keith J. wrote:

> Hi There,
> 
> There are already such tools available commercially!!
> 
> TeX and LaTeX have been around for a long time. Textures is a
> WYSIWYG system and authoring tool.
> 
> LaTeX can be easily converted to pdf, html, xml, docbook, etc.
> 
> As Bowerbird mentioned in another thread why reinvent the wheel or
> try to.
> 
> Just my two Euro cents worth!
> 
> regards
> Keith.
> 
> P.S. LaTeX is mark-up, Has chapters, paragraphs, formulas, graphics,
> footnotes, layout control, indices, bibliographies, multi-language,
> right-left, left-right, ASCII, Unicode, practically platform
> independent. There are freeware versions, but they are generally not
> WYSIWYG, thereby having at first a stiff learning curve.
> 
> Keith.

I certainly agree that we should adopt as much from existing 
formats/software as possible. If you look at Mr. Noring's original call 
to action that has Bowerbird so agitated 
(http://groups.yahoo.com/group/ebook-community/message/26923) you will 
see that it is, at this point, more an attempt to gather requirements 
than to specify a design.

Now LaTeX is advertised as a "typesetting language," so I had originally 
dismissed it as a presentational language, whereas what we are looking 
for is a way to mark up a document structure without specifying the 
presentation. Prompted by your message, I went to the Internet and 
looked at LaTeX a little more closely. I discovered that I was wrong.

LaTeX is document structure markup, not document presentation markup, 
apparently almost as powerful as TEI, and probably more powerful than 
XHTML. I think it could work as a master format for an authoring tool.

However, in the spirit of not re-inventing wheels, I would like to reuse 
not only an existing format, but also existing code and tools as well. 
TeX isn't useful in this context because it is a printer driver; we're 
not interested in printing, we're interested in conversion from the 
master format to multiple e-book formats. Are there other tools or 
available code which we could re-use? My current bias is to start with a 
subset of TEI because 1. it is well-understood and well-established, and 
2. there are lots of tools and available code implementations to 
manipulate XML files. Nonetheless, I could be persuaded to go with 
LaTeX, and would love to hear the arguments in its favor.

I do note that this discussion is somewhat tangential to the mission of 
Project Gutenberg, which is to make available simplified text versions 
of public domain works, so I would invite everyone interested to 
continue the conversation on the ebook-community discussion list.

-- 
Nothing of significance below this line.

From Bowerbird at aol.com  Fri Oct 20 11:09:00 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 11:09:08 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
Message-ID: <bc6.55a2967.326a6abc@aol.com>

marcello is trying to overwhelm your e-mailbox with posts, 
so everyone gives up and stops reading all of these threads.
it's a crude technique, but it does work.   thus, to counter it,
i've responded to several of his messages in this one reply.

***

marcello said:
>    Then why do you invent a new format that is inferior to wiki?

because i like to tinker.   besides, z.m.l. is _not_ "inferior" to wiki.
it might not be "superior" either, but it's definitely not "inferior".
it's just a different take on the same general idea.   since it is a
rather new idea, it's good to experiment with many approaches.


>    Everybody already knows wiki, so go with that.

i'm not the type of person who "goes" with "everybody".
but thanks for the suggestion.


>   Now we just need to figure out how to get the same out of you!

except you are playing in the sandbox of my thread.
let's see _you_ start a thread that people care about.

95% of your posts over the last 3 years have been
a _direct_ reply to a point of _mine_.   it's as if you
have no life at all except the one that i give to you.
i'm the rain, and you're the poisonous mushrooms.

c'mon, man, develop a _spine_, for crying out loud...


>   You are just cutting and pasting out of some perl cgi tutorial. 

is that a bad thing?

that's how most programmers start learning.

and yep, i'm just a beginner with perl.

yet i'm going to show you how much power
can be realized by even a beginner like me
-- given a nice, simple format like .zml --
with nothing but a rudimentary knowledge
of a dozen or so concepts and commands...

that's the whole point of this little exercise...


>    You don't have the least idea what is going on.

i don't think it's too hard to figure those lines out.
would you like me to explain them to you and then
you can tell me if i got it right?, because i can do that.

-bowerbird
From desrod at gnu-designs.com  Fri Oct 20 11:12:32 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Fri Oct 20 11:13:37 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <bc6.55a2967.326a6abc@aol.com>
References: <bc6.55a2967.326a6abc@aol.com>
Message-ID: <1161367952.6048.5.camel@localhost.localdomain>

On Fri, 2006-10-20 at 14:09 -0400, Bowerbird@aol.com wrote:
> thus, to counter it, i've responded to several of his messages in this
> one reply. 

But you've retained the Reply-To and MessageID headers from only one
thread, so this reply will get buried deep inside only one thread, so
the context in the other threads will be lost. That is the whole point
of threading... and now you've co-opted it for reasons which I cannot
seem to understand. 
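
To make the threading point concrete, here is a minimal sketch, assuming
a mail client that attaches each message to the single Message-ID named
in its In-Reply-To header. The function name thread_parents and the
sample messages are made up for illustration; real clients usually also
consult the References header.

```python
from email import message_from_string


def thread_parents(raw_messages):
    """Map each Message-ID to the Message-ID it replies to (or None)."""
    parents = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = msg["Message-ID"].strip()
        parent = msg["In-Reply-To"]  # one header, hence at most one parent
        parents[msg_id] = parent.strip() if parent else None
    return parents


# Hypothetical messages: two separate threads and one combined reply.
THREAD_A = "Message-ID: <a@example.org>\nSubject: thread A\n\nfirst thread\n"
THREAD_B = "Message-ID: <b@example.org>\nSubject: thread B\n\nsecond thread\n"
COMBINED = (
    "Message-ID: <c@example.org>\n"
    "In-Reply-To: <a@example.org>\n"
    "Subject: Re: thread A\n\n"
    "answers points from both threads\n"
)

parents = thread_parents([THREAD_A, THREAD_B, COMBINED])
```

Under that assumption the combined reply hangs off thread A only, and
nothing links it to thread B.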


-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
From lee at novomail.net  Fri Oct 20 11:19:28 2006
From: lee at novomail.net (Lee Passey)
Date: Fri Oct 20 11:17:57 2006
Subject: [gutvol-d] what it all boils down to
In-Reply-To: <c7c.99fa62.32691fd3@aol.com>
References: <c7c.99fa62.32691fd3@aol.com>
Message-ID: <45391330.4030702@novomail.net>

(I'm probably going to regret this, but ...)

Bowerbird@aol.com wrote:

> or "indent lines of the poem however much you want them to
> be indented, but use at least one space of indentation so we
> know that it's a _block_ and that it shouldn't be re-wrapped"...

Just out of idle curiosity, in ZML how do you mark up a block quotation, 
that is, one or more full or partial paragraphs quoted from another 
source, which are typically block offset but which should nevertheless 
be word-wrapped?

-- 
Nothing of significance below this line.

From marcello at perathoner.de  Fri Oct 20 11:18:42 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 11:18:46 2006
Subject: [gutvol-d] pulled back from the brink to live yet another day
In-Reply-To: <bc6.55a2967.326a6abc@aol.com>
References: <bc6.55a2967.326a6abc@aol.com>
Message-ID: <45391302.8040909@perathoner.de>

Bowerbird@aol.com wrote:

>>    You don't have the least idea what is going on.
> 
> i don't think it's too hard to figure those lines out.
> would you like me to explain them to you and then
> you can tell me if i got it right?, because i can do that.

Yes. Please explain. Make my day.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Fri Oct 20 11:19:36 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 11:19:48 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
Message-ID: <c76.2e2c88c.326a6d38@aol.com>

lee said:
>    Prompted by your message, I went to the Internet and
>    looked at LaTeX a little more closely. I discovered that I was wrong.

see, william, here's another good thing.   lee has discovered latex,
and it happened because he was involved in a thread right here...


>    Nonetheless, I could be persuaded to go with LaTeX, 
>    and would love to here the arguments in its favor.

hear here!


>    I do note that this discussion is somewhat tangential to 
>    the mission of Project Gutenberg, which is to make 
>    available simplified text versions of public domain works

founder michael hart doesn't define the mission that narrowly.

he's in favor of whatever formats -- in addition to plain text --
people want, so a multi-format converter is right on-topic here.

indeed, that is the impetus for the whole movement to .tei, or
at least it was originally, that it could generate multiple formats.


>    so I would invite everyone interested to continue the 
>    conversation on the ebook-community discussion list.

for all the people who want to get away from bowerbird,
jon noring's listserve is _the_ place to do that, yes sir!          :+)
yep, jon banned me from there many many years ago...

-bowerbird
From jon at noring.name  Fri Oct 20 11:24:26 2006
From: jon at noring.name (Jon Noring)
Date: Fri Oct 20 11:31:27 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
In-Reply-To: <45390FA3.4050502@novomail.net>
References: <c10.7616f00.326884df@aol.com>
	<A2D00D2F-F826-48C2-A1CF-37BE89A99CB2@uni-trier.de>
	<45390FA3.4050502@novomail.net>
Message-ID: <891243459.20061020122426@noring.name>

Lee wrote:

> I certainly agree that we should adopt as much from existing 
> formats/software as possible. If you look at Mr. Noring's original call
> to action that has Bowerbird so agitated 
> (http://groups.yahoo.com/group/ebook-community/message/26923) you will
> see that it is, at this point, more an attempt to gather requirements 
> than to specify a design.

Yes, my original TeBC message was essentially a requirements gathering
process.

Also note that I did not post the call for requirements to gutvol-* since
I deemed it to be mostly off-topic to gutvol. It was Bowerbird who
dragged that into gutvol-* and from there the fun began.

Jon Noring

From Bowerbird at aol.com  Fri Oct 20 11:40:16 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 11:40:23 2006
Subject: [gutvol-d] what it all boils down to
Message-ID: <bee.6dca589.326a7210@aol.com>

lee said:
>    Just out of idle curiosity, in ZML how do you mark up a block
>    quotation, that is, one or more full or partial paragraphs quoted
>    from another source, which are typically block offset but which
>    should nevertheless be word-wrapped?

if you have questions about the rules of .zml, you should go review them:
>    http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt
(especially on days like the last week, when message traffic is so heavy here.)

if you still have questions after reading the rules, then i'll be happy to help.

but i suspect that this isn't merely "idle curiosity".           :+)

my guess is that you're trying to ask when word-wrapping _will_ occur,
and when it will not.   or, put another way, how can an author specify that
word-wrapping _should_ be done, as opposed to when it should _not_...

if that's what you _really_ want to know, lee, then ask it directly, ok?,
and i'll be happy to tell you, assuming the answer is not clear to you
once you've read the rules.   (i honestly can't remember if it's clear in
the version of the rules that is posted, which might be outdated now,
so maybe you can tell me that.   i might well have considered the case
where rewrapping is _wanted_ to be "too advanced" for the rules then;
and part of the reason for _that_ is that "rewrapping" might not mean
the same thing in your mind as it means in the display-world of .zml...
but that's getting _far_ too removed from any relevance to gutvol-d.)

-bowerbird
From marcello at perathoner.de  Fri Oct 20 11:49:22 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Oct 20 11:49:25 2006
Subject: [gutvol-d] what it all boils down to
In-Reply-To: <bee.6dca589.326a7210@aol.com>
References: <bee.6dca589.326a7210@aol.com>
Message-ID: <45391A32.1050404@perathoner.de>

Bowerbird@aol.com wrote:

> if that's what you _really_ want to know, lee, then ask it directly, ok?,
> and i'll be happy to tell you, assuming the answer is not clear to you
> once you've read the rules.   (i honestly can't remember if it's clear in
> the version of the rules that is posted, which might be outdated now,
> so maybe you can tell me that.   i might well have considered the case
> where rewrapping is _wanted_ to be "too advanced" for the rules then;
> and part of the reason for _that_ is that "rewrapping" might not mean
> the same thing in your mind as it means in the display-world of .zml...
> but that's getting _far_ too removed from any relevance to gutvol-d.)

That's a long-winded way to say: blockquotes are not supported in ZML.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Fri Oct 20 12:13:43 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 20 12:13:50 2006
Subject: [gutvol-d] a new policy -- one post per day!
Message-ID: <305.472c68d1.326a79e7@aol.com>

beginning on monday of next week, i will adopt
a new policy, and post just one message per day.

on some days, it'll be a very _long_ post, to be sure,
if lotsa people have pitched flak at me the day before.

nonetheless, i will still make just _one_ post per day,
so as to sidestep the overflow-mailbox strategy that 
some of my detractors are forcing down our throats.

so let's get started, eh?   this is my last post for today.

-bowerbird

p.s.   and yes, we'll still do the open-source project!
i'll just work it into my one-post-per-day regime...
have a nice weekend!
From lee at novomail.net  Fri Oct 20 12:37:05 2006
From: lee at novomail.net (Lee Passey)
Date: Fri Oct 20 12:35:37 2006
Subject: [gutvol-d] what it all boils down to
In-Reply-To: <bee.6dca589.326a7210@aol.com>
References: <bee.6dca589.326a7210@aol.com>
Message-ID: <45392561.1010709@novomail.net>

Bowerbird@aol.com wrote:

> lee said:
>> Just out of idle curiosity, in ZML how do you mark up a block 
>> quotation, that is, one or more full or partial paragraphs quoted 
>> from another source, which are typically block offset but which
>> should nevertheless be word-wrapped?
> 
> if you have questions about the rules of .zml, you should go review 
> them:
> http://snowy.arsc.alaska.edu/bowerbird/test-suite/zml11rules.txt
> (especially on days like the last week, when message traffic is so 
> heavy here.)
> 
> if you still have questions after reading the rules, then i'll be 
> happy to help.
> 
> but i suspect that this isn't merely "idle curiosity".          :+)
> 
> my guess is that you're trying to ask when word-wrapping _will_ 
> occur, and when it will not.  or, put another way, how can an author
> specify that word-wrapping _should_ be done, as opposed to when it 
> should _not_..

Actually, what I'm really trying to ask is, in ZML how do you mark up a 
block quotation? The only mention of blocks in your ZML description file 
is in ruleset 4, where you suggest that no "line" which begins with a 
whitespace will be wrapped (I simply assume that, absent indications to 
the contrary, all other text will be wrapped).

Maybe you could make a /new/ rule that block quotes are a collection of 
lines that begin with right angle bracket, as above? Then your ZML 
viewer program could detect that, remove the markup, and display the 
block quote according to the user's preferences.
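
A rough sketch of how such a viewer might behave, written in Python
purely for illustration. The three rules it applies are assumptions
drawn from this message, not the actual ZML rules: lines beginning with
">" form a block quote that is re-wrapped and offset, lines beginning
with whitespace are shown verbatim, and everything else is wrapped as an
ordinary paragraph. The name render_zml_like and the four-space offset
are hypothetical.

```python
import textwrap


def render_zml_like(lines, width=40):
    """Render lines under three assumed rules (not the real ZML rules)."""
    out = []
    buffer = []   # pending lines of the current wrappable run
    kind = None   # "para" or "quote"

    def flush():
        nonlocal buffer, kind
        if buffer:
            text = " ".join(buffer)
            if kind == "quote":
                # Block quote: markup removed, re-wrapped, then offset.
                out.extend(textwrap.wrap(text, width,
                                         initial_indent="    ",
                                         subsequent_indent="    "))
            else:
                out.extend(textwrap.wrap(text, width))
        buffer, kind = [], None

    for line in lines:
        if not line.strip():
            flush()
            out.append("")            # blank line ends the current run
        elif line.startswith(">"):
            if kind != "quote":
                flush()
                kind = "quote"
            buffer.append(line.lstrip(">").strip())
        elif line[0].isspace():
            flush()
            out.append(line)          # leading whitespace: never re-wrapped
        else:
            if kind != "para":
                flush()
                kind = "para"
            buffer.append(line.strip())
    flush()
    return out


sample = [
    "As the author says:",
    "> one or more full or partial paragraphs",
    "> quoted from another source, offset but",
    "> still word-wrapped.",
    "",
    "  a line of verse,",
    "      indented and never re-wrapped",
]
result = render_zml_like(sample, width=40)
```

The offset and the width are display-side choices here, so a user
preference could change them without touching the marked-up text.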

-- 
Nothing of significance below this line.

From cannona at fireantproductions.com  Fri Oct 20 14:52:30 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Fri Oct 20 14:54:46 2006
Subject: [gutvol-d] meanwhile, i'm really excited!
References: <c10.7616f00.326884df@aol.com><A2D00D2F-F826-48C2-A1CF-37BE89A99CB2@uni-trier.de>
	<45390FA3.4050502@novomail.net>
Message-ID: <010601c6f492$5c7b39e0$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

One great thing about LaTeX is that it is very widely used.  I dare say
that the only markup languages that are more widely known are HTML and
possibly wiki markup.  At least, that's my guess based on my university
experience.  It seems that virtually all of the graduate-level math,
science, engineering, statistics, and actuarial students know and use it
to one degree or another.  Also, it has wonderfully comprehensive support
for math and science equations.  On the other hand, it might be easier to
parse XML.  Also, as MediaWiki has shown, it is quite easy to take the
best LaTeX has to offer (i.e., its ability to represent complex equations)
and add that to nearly any other format.

Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: "Lee Passey" <lee@novomail.net>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Friday, October 20, 2006 1:04 PM
Subject: Re: [gutvol-d] meanwhile, i'm really excited!


> Schultz Keith J. wrote:
>
>> Hi There,
>>
>> There are already such tools available commercially!!
>>
>> TeX and LaTeX have been around for a long time. Textures is a
>> WYSIWYG system and authoring tool.
>>
>> LaTeX can be easily converted to pdf, html, xml, docbook, etc.
>>
>> As Bowerbird mentioned in another thread why reinvent the wheel or
>> try to.
>>
>> Just my two Euro cents worth!
>>
>> regards
>> Keith.
>>
>> P.S. LaTeX is mark-up. It has chapters, paragraphs, formulas, graphics,
>> footnotes, layout control, indices, bibliographies, multi-language
>> support, right-to-left and left-to-right text, ASCII, and Unicode, and
>> it is practically platform independent. There are freeware versions,
>> but they are generally not WYSIWYG, so at first they have a steep
>> learning curve.
>>
>> Keith.
>
> I certainly agree that we should adopt as much from existing
> formats/software as possible. If you look at Mr. Noring's original call to
> action that has Bowerbird so agitated
> (http://groups.yahoo.com/group/ebook-community/message/26923) you will see
> that it is, at this point, more an attempt to gather requirements than to
> specify a design.
>
> Now LaTeX is advertised as a "typesetting language," so I had originally
> dismissed it as a presentational language, whereas what we are looking for
> is a way to markup a document structure without specifying the
> presentation. Prompted by your message, I went to the Internet and looked
> at LaTeX a little more closely. I discovered that I was wrong.
>
> LaTeX is document structure markup, not document presentation markup,
> apparently almost as powerful as TEI, and probably more powerful than
> XHTML. I think it could work as a master format for an authoring tool.
>
> However, in the spirit of not re-inventing wheels, I would like to reuse
> not only an existing format, but also existing code and tools as well. TeX
> isn't useful in this context because it is a printer driver; we're not
> interested in printing, we're interested in conversion from the master
> format to multiple e-book formats. Are there other tools or available code
> which we could re-use? My current bias is to start with a subset of TEI
> because 1. it is well-understood and well-established, and 2. there are
> lots of tools and available code implementations to manipulate XML files.
> Nonetheless, I could be persuaded to go with LaTeX, and would love to here
> the arguments in its favor.
>
> I do note that this discussion is somewhat tangential to the mission of
> Project Gutenberg, which is to make available simplified text versions of
> public domain works, so I would invite everyone interested to continue the
> conversation on the ebook-community discussion list.
>
> --
> Nothing of significance below this line.
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOUWoI7J99hVZuJcRAv1nAKDQPXQs3qSMNUtN+xUEbOXJyuDcpACePVcY
H2LydfP+xbkj5/5oDMREBTA=
=xvGc
-----END PGP SIGNATURE-----

From scott_bulkmail at productarchitect.com  Fri Oct 20 21:11:29 2006
From: scott_bulkmail at productarchitect.com (Scott Lawton)
Date: Sat Oct 21 00:44:33 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for 
	PG-related  format and process debate
In-Reply-To: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net>
References: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net>
Message-ID: <p06110415c15f4c8e7100@[192.168.0.52]>

>The problem is this: bowerbird is probably one of the most proficient
>and accomplished trolls I've run across in 15+ years on the Internet. 

And, that Greg is too nice to ban him.  Or create gutvol-bb and ban him
from every other gut list.

And, as you noted, that lots of people who should (IMHO) know better
continue to reply to him.  I periodically check the [folder name censored]
where my filter dumps his posts; I've yet to encounter a reason to disable
the filter.
-- 

Cheers,

Scott S. Lawton
http://Classicosm.com/ - classic books
From cannona at fireantproductions.com  Sat Oct 21 04:25:16 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sat Oct 21 04:25:32 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
References: <24517094.1161363103267.JavaMail.?@fh1039.dia.cp.net>
	<p06110415c15f4c8e7100@[192.168.0.52]>
Message-ID: <001d01c6f503$9a85e110$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

If I am not mistaken, Greg unbanned him because of a mandate, not by choice.
Or at least, that is what I recall he said on the matter when asked, and I
have no reason to doubt him.

Sincerely
Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: "Scott Lawton" <scott_bulkmail@productarchitect.com>
To: <gutvol-d@lists.pglaf.org>
Sent: Friday, October 20, 2006 11:11 PM
Subject: Re: [gutvol-d] Proposal: creation of new mailing list for
PG-related format and process debate


> >The problem is this: bowerbird is probably one of the most proficient
> >and accomplished trolls I've run across in 15+ years on the Internet.
>
> And, that Greg is too nice to ban him.  Or create gutvol-bb and ban him
> from every other gut list.
>
> And, as you noted, that lots of people who should (IMHO) know better
> continue to reply to him.  I periodically check the [folder name censored]
> where my filter dumps his posts; I've yet to encounter a reason to disable
> the filter.
> --
>
> Cheers,
>
> Scott S. Lawton
> http://Classicosm.com/ - classic books

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFOgOlI7J99hVZuJcRAvdXAJ9MPfw4/bs1samDYFsa7yAoCwl6agCgvB6V
A23mtZrAONIR14MHRlXBeBA=
=dbfD
-----END PGP SIGNATURE-----

From gbnewby at pglaf.org  Sat Oct 21 23:49:08 2006
From: gbnewby at pglaf.org (Greg Newby)
Date: Sat Oct 21 23:49:10 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <001d01c6f503$9a85e110$0300a8c0@blackbox>
Message-ID: <20061022064908.GA6749@pglaf.org>

On Sat, Oct 21, 2006 at 06:25:16AM -0500, Aaron Cannon wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> If I am not mistaken, Greg unbanned him because of a mandate, not by choice.
> Or at least, that is what I recall he said on the matter when asked, and I
> have no reason to doubt him.
> 
> Sincerely
> Aaron Cannon

BB was never banned, he was simply moderated.  When we moved to new
mailing list software, I unmoderated him.  At the time of moderation, I
listened to community pressure, and followed through after making
specific requests to BB for changes in behavior.  However, ultimately
individualized moderation is just not consistent with the overall way PG
operates (see the "About" essays Michael and I worked on last year at
www.gutenberg.org).

Sidenote: 
Chuck Mattsen has handled moderation for posted and pgww for
a while, and needs to give up this duty.  I could use a few volunteers to
handle moderation of those lists, which have just a few subscribers but
allow posting (after a moderation decision) by anyone.  We also get a
lot of spam to the other lists, including -d, glibrary and the
newsletter lists, requiring moderation action.  Basically this involves
about 10 instances per day of spending a few seconds with the Mailman
web-based interface.  Email me if you monitor your email very
regularly, and might be able to help with moderation.

Back to topic:
I encourage people to take control of their own mailboxes.  If you don't
like reading postings from someone, filter them.  If you don't know how
to filter people in the email program you use, ask here and we'll help.
Many email programs can filter entire threads (by Subject line), but
filtering an individual's email is even easier.
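
For anyone who would rather script it than click through a mail client,
here is a minimal sketch of filtering by sender, using only the Python
standard library. The function name split_by_sender and the
example.org addresses are made up for illustration; a real setup would
more likely use the filter rules built into the mail program.

```python
from email import message_from_string
from email.utils import parseaddr


def split_by_sender(raw_messages, blocked):
    """Partition raw messages into (kept, filtered) by From address."""
    blocked = {addr.lower() for addr in blocked}
    kept, filtered = [], []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # parseaddr handles both "Name <addr>" and bare "addr" forms.
        _name, addr = parseaddr(msg.get("From", ""))
        (filtered if addr.lower() in blocked else kept).append(raw)
    return kept, filtered


# Hypothetical inbox with three posts from two senders.
INBOX = [
    "From: Some Poster <noisy@example.org>\nSubject: one\n\nbody\n",
    "From: quiet@example.org\nSubject: two\n\nbody\n",
    "From: NOISY@example.org\nSubject: three\n\nbody\n",
]

kept, filtered = split_by_sender(INBOX, ["noisy@example.org"])
```

Matching is done on the address alone and ignores case, so a changed
display name does not slip past the filter.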

I would prefer that Mailman (which otherwise is a capable and wonderful
mailing list manager) offered subscribers the option to elect not to see
messages from particular addresses, but that's not a currently available
feature.  Scott's recommendations, below, make a lot of sense to me.
  -- Greg

> - --
> Skype: cannona
> MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
> address.)
> - ----- Original Message -----
> From: "Scott Lawton" <scott_bulkmail@productarchitect.com>
> To: <gutvol-d@lists.pglaf.org>
> Sent: Friday, October 20, 2006 11:11 PM
> Subject: Re: [gutvol-d] Proposal: creation of new mailing list for
> PG-related format and process debate
> 
> 
> >>The problem is this: bowerbird is probably one of the most proficient
> >>and accomplished trolls I've run across in 15+ years on the Internet.
> >
> >And, that Greg is too nice to ban him.  Or create gutvol-bb and ban him
> >from every other gut list.
> >
> >And, as you noted, that lots of people who should (IMHO) know better
> >continue to reply to him.  I periodically check the [folder name censored]
> >where my filter dumps his posts; I've yet to encounter a reason to disable
> >the filter.
> >--
> >
> >Cheers,
> >
> >Scott S. Lawton
> >http://Classicosm.com/ - classic books
From hyphen at hyphenologist.co.uk  Sun Oct 22 01:40:38 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sun Oct 22 01:40:52 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <20061022064908.GA6749@pglaf.org>
References: <001d01c6f503$9a85e110$0300a8c0@blackbox>
	<20061022064908.GA6749@pglaf.org>
Message-ID: <aabmj2lkn3g9c9c8tdgdvp742mik1gpvjb@4ax.com>

On Sat, 21 Oct 2006 23:49:08 -0700,  Greg Newby <gbnewby@pglaf.org> wrote:


|I encourage people to take control of their own mailboxes.  If you don't
|like reading postings from someone, filter them.  If you don't know how
|to filter people in the email program you use, ask here and we'll help.
|Many email programs can filter entire threads (by Subject line), but
|filtering an individual's email is even easier.

Agreed

|I would prefer that Mailman (which otherwise is a capable and wonderful
|mailing list manager) offered subscribers the option to elect not to see
|messages from particular addresses, but that's not a currently available
|feature.  Scott's recommendations, below, make a lot of sense to me.

Agent 4 has introduced a fabulous Bayesian filtering system.
IMO it is worth every penny I spent on it.
Just drag BB's posts to the junk folder and they will always end up there.
Alternatively, just set up a filter to delete BB's posts.

-- 
Dave Fawthrop <dave hyphenologist co uk> For Yorkshire Dialect 
http://www.gutenberg.org/author/John_Hartley
http://www.gutenberg.org/author/F_W_Moorman
19,000 free e-books at Project Gutenberg! http://www.gutenberg.org


From klofstrom at gmail.com  Sun Oct 22 01:44:38 2006
From: klofstrom at gmail.com (Karen Lofstrom)
Date: Sun Oct 22 01:44:41 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <aabmj2lkn3g9c9c8tdgdvp742mik1gpvjb@4ax.com>
References: <001d01c6f503$9a85e110$0300a8c0@blackbox>
	<20061022064908.GA6749@pglaf.org>
	<aabmj2lkn3g9c9c8tdgdvp742mik1gpvjb@4ax.com>
Message-ID: <1e8e65080610220144w22f6dafakead2785217d0d12a@mail.gmail.com>

Thanks for the reminder about the filter. I switched to gmail a few months
ago and had never used their filter option. First time for everything.
Extremely easy to use. Bye-bye BB.

--
Karen Lofstrom
From desrod at gnu-designs.com  Sun Oct 22 05:23:43 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Sun Oct 22 05:24:54 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for PG-related
	format and process debate
In-Reply-To: <20061022064908.GA6749@pglaf.org>
References: <20061022064908.GA6749@pglaf.org>
Message-ID: <Pine.LNX.4.64.0610220822040.25195@aphrodite.gnu-designs.com>


> I would prefer that Mailman (which otherwise is a capable and 
> wonderful mailing list manager) offered subscribers the option to 
> elect not to see messages from particular addresses, but that's not 
> a currently available feature.

Wouldn't this ultimately expose email addresses to other users on the 
list? You'd have to have some way to select the user you wanted to 
filter/ignore by, and if that user never put in a "real name" when 
they subscribed to Mailman, the only other identifying information 
would be their email address.

I suspect it wouldn't be a feature in Mailman any time soon more for 
ethical reasons than technical ones.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From davedoty at hotmail.com  Sun Oct 22 09:33:49 2006
From: davedoty at hotmail.com (Dave Doty)
Date: Sun Oct 22 09:33:51 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for
	PG-related	format and process debate
Message-ID: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>


> From: gbnewby@pglaf.org

> I encourage people to take control of their own mailboxes.  If you don't
> like reading postings from someone, filter them. 

The problem is the high number of people who seem to enjoy arguing with BB.
Even though I banned him years ago, it's still not uncommon that I open the
mailbox and find it stuffed full of e-mail quoting him in full and following
with extended arguing.  The problem isn't even BB himself, but that this has
become the BB Forum, and that debating him seems to take up more space than
everything else PG-related.  Other than banning half the list, most of whom
have things worth saying in other contexts, how can I take control of my
mailbox to deal with this?  Or is it a case of "put up with it or leave?"
From hyphen at hyphenologist.co.uk  Sun Oct 22 10:34:54 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sun Oct 22 10:35:05 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for
	PG-related	format and process debate
In-Reply-To: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
References: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
Message-ID: <emanj2da29b168u8gppur6494t71mdovpm@4ax.com>

On Sun, 22 Oct 2006 16:33:49 +0000,  Dave Doty <davedoty@hotmail.com>
wrote:

|
|> From: gbnewby@pglaf.org
|
|> I encourage people to take control of their own mailboxes.  If you don't
|> like reading postings from someone, filter them. 
|
|The problem is the high number of people who seem to enjoy arguing with BB.  
|Even though I banned him years ago, it's still not uncommon that I open the 
|mailbox and find it stuffed full of e-mail quoting him in full and following 
|with extended arguing.  The problem isn't even BB himself, but that this has 
|become the BB Forum, and that debating him seems to take up more space than 
|everything else PG-related.  Other than banning half the list, most of whom 
|have things worth saying in other contexts, how can I take control of my 
|mailbox to deal with this?  Or is it a case of "put up with it or leave?"
<plug>
Agent 4 (or was it 3?) allows you to ignore a sub-thread, so you can get
rid of replies to BB's posts and anything downthread.
</plug>
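
For readers without Agent, the same idea can be sketched in a few lines
of Python. This is not how Agent implements it; it is only an
illustration that assumes each message records its parent's ID and that
messages are processed in arrival order, so a parent is always seen
before its replies. The name visible_messages and the sample thread are
hypothetical.

```python
def visible_messages(messages, blocked_sender):
    """Return IDs of messages left after ignoring a sender's sub-threads.

    Each entry is a dict with 'id', 'parent' and 'sender'; the list is
    assumed to be in arrival order (parents before their replies).
    """
    hidden = set()
    visible = []
    for msg in messages:
        if msg["sender"] == blocked_sender or msg["parent"] in hidden:
            hidden.add(msg["id"])   # the post itself, or a reply downthread
        else:
            visible.append(msg["id"])
    return visible


# Hypothetical thread: m2 is from the filtered sender, m3 and m4 sit below it.
THREAD = [
    {"id": "m1", "parent": None, "sender": "alice@example.org"},
    {"id": "m2", "parent": "m1", "sender": "noisy@example.org"},
    {"id": "m3", "parent": "m2", "sender": "bob@example.org"},
    {"id": "m4", "parent": "m3", "sender": "alice@example.org"},
    {"id": "m5", "parent": "m1", "sender": "carol@example.org"},
]

print(visible_messages(THREAD, "noisy@example.org"))  # ['m1', 'm5']
```

Replies that quote the filtered sender disappear along with the post
they answer, while sibling replies elsewhere in the thread stay visible.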

-- 
Dave Fawthrop <dave hyphenologist co uk> For Yorkshire Dialect 
http://www.gutenberg.org/author/John_Hartley
http://www.gutenberg.org/author/F_W_Moorman
19,000 free e-books at Project Gutenberg! http://www.gutenberg.org


From desrod at gnu-designs.com  Sun Oct 22 10:55:32 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Sun Oct 22 10:57:53 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for 
	PG-related format and process debate
In-Reply-To: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
References: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
Message-ID: <1161539732.10407.4.camel@localhost.localdomain>

On Sun, 2006-10-22 at 16:33 +0000, Dave Doty wrote:
> The problem isn't even BB himself, but that this has become the BB
> Forum, and that debating him seems to take up more space than
> everything else PG-related.

I think you've hit the nail on the head. He's co-opted the list, so
everyone has to either respond to him, or keep quiet. If you notice, he
doesn't even let the smallest mention of his name slip, without a
personal reply. 

When he's backed into a corner, he lays blame elsewhere by pointing
fingers to someone else and goes on making splinter threads to keep
everyone off-track, following his fake leads. 

He quite literally *CANNOT* stop replying or ignore someone who mentions
his name or replies to a thread that he has ever responded to. 

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

"Erosion of civil liberties... is a threat to national security."
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/4bee0377/attachment.bin
From cannona at fireantproductions.com  Sun Oct 22 11:04:19 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Oct 22 11:06:11 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for
	PG-related format and process debate
References: <20061022064908.GA6749@pglaf.org>
Message-ID: <003b01c6f604$bedec390$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sorry to have misstated the situation.  I must have either misunderstood
your message on the topic or simply remembered wrong.  Either way, my bad.

Sincerely
Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: "Greg Newby" <gbnewby@pglaf.org>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Sunday, October 22, 2006 1:49 AM
Subject: Re: [gutvol-d] Proposal: creation of new mailing list for
PG-related format and process debate


> On Sat, Oct 21, 2006 at 06:25:16AM -0500, Aaron Cannon wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> If I am not mistaken, Greg unbanned him because of a mandate, not by
>> choice.  Or at least, that is what I recall he said on the matter when
>> asked, and I have no reason to doubt him.
>>
>> Sincerely
>> Aaron Cannon
>
> BB was never banned, he was simply moderated.  When we moved to new
> mailing list software, I unmoderated him.  At the time of moderation, I
> listened to community pressure, and followed through after making
> specific requests to BB for changes in behavior.  However, ultimately
> individualized moderation is just not consistent with the overall way PG
> operates (see the "About" essays Michael and I worked on last year at
> www.gutenberg.org).
>
> Sidenote:
> Chuck Mattsen has handled moderation for posted and pgww for
> awhile, and needs to give up this duty.  I could use a few volunteers to
> handle moderation of those lists, which have just a few subscribers but
> allow posting (after a moderation decision) by anyone.  We also get a
> lot of spam to the other lists, including -d, glibrary and the
> newsletter lists, requiring moderation action.  Basically this involves
> about 10 instances per day of spending a few seconds with the Mailman
> web-based interface.  Email me if you monitor your email very
> regularly, and might be able to help with moderation.
>
> Back to topic:
> I encourage people to take control of their own mailboxes.  If you don't
> like reading postings from someone, filter them.  If you don't know how
> to filter people in the email program you use, ask here and we'll help.
> Many email programs can filter entire threads (by Subject line), but
> filtering an individual's email is even easier.
>
> I would prefer that Mailman (which otherwise is a capable and wonderful
> mailing list manager) offered subscribers the option to elect not to see
> messages from particular addresses, but that's not a currently available
> feature.  Scott's recommendations, below, make a lot of sense to me.
>  -- Greg
>
>> - --
>> Skype: cannona
>> MSN/Windows Messenger: cannona@hotmail.com (don't send email to the
>> hotmail address.)
>> - ----- Original Message -----
>> From: "Scott Lawton" <scott_bulkmail@productarchitect.com>
>> To: <gutvol-d@lists.pglaf.org>
>> Sent: Friday, October 20, 2006 11:11 PM
>> Subject: Re: [gutvol-d] Proposal: creation of new mailing list for
>> PG-related format and process debate
>>
>>
>> >>The problem is this: bowerbird is probably one of the most proficient
>> >>and accomplished trolls I've run across in 15+ years on the Internet.
>> >
>> >And, that Greg is too nice to ban him.  Or create gutvol-bb and ban him
>> >from every other gut list.
>> >
>> >And, as you noted, that lots of people who should (IMHO) know better
>> >continue to reply to him.  I periodically check the [folder name
>> >censored] where my filter dumps his posts; I've yet to encounter a
>> >reason to disable the filter.
>> >--
>> >
>> >Cheers,
>> >
>> >Scott S. Lawton
>> >http://Classicosm.com/ - classic books
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFO7MOI7J99hVZuJcRAhQDAKCH/JGH2zLTDslARHOEIYEFhoqkAACfSnwU
YoGM4Q72ariZqWK1c8Usx00=
=LYIS
-----END PGP SIGNATURE-----

From Bowerbird at aol.com  Sun Oct 22 12:12:22 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Oct 22 12:12:33 2006
Subject: re: [gutvol-d] Proposal: creation of new mailing list for
	PG-related format and process debate
Message-ID: <583.d050952.326d1c96@aol.com>

and the ad hominem continues unabated,
with the lynch mob having convinced itself
that it speaks for the whole town.   amusing.

when was the last time there was an interesting
thread here which i didn't bring into existence?
do you think hundreds of lurkers are stupid --
can't tell who stays on-topic and who does not?

i'm _voluntarily_ limiting myself to one post a day
from now on out -- until/unless someone abuses
my self-restraint -- so strain on our e-mailboxes
downstream is clearly attributed to my detractors,
who don't seem capable of talking about anything
except me.   what sad pathetic lives they must have
if their attempts to bully me are the best part of it.

it's ironic, because i _came_ here to talk to michael,
but he doesn't even hang around here any more...

at any rate, this is sunday's post.   sorry for the detour.
tomorrow i'll be back with some perl code that shows
some things p.g. can accomplish with plain-text files.
will my detractors add to this "open-source" effort?
we'll see.   but if i were you, i wouldn't hold my breath.

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/52f77487/attachment.html
From scott_bulkmail at productarchitect.com  Sun Oct 22 15:45:41 2006
From: scott_bulkmail at productarchitect.com (Scott Lawton)
Date: Sun Oct 22 15:56:10 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for 
	PG-related format and process debate
In-Reply-To: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
References: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
Message-ID: <p0611042ec161a0c986aa@[192.168.0.52]>

>> I encourage people to take control of their own mailboxes.  If you don't
>> like reading postings from someone, filter them.
>
>The problem is the high number of people who seem to enjoy arguing with BB.  Even though I banned him years ago, it's still not uncommon that I open the mailbox and find it stuffed full of e-mail quoting him in full and following with extended arguing.

Many (though not all) of these replies include his name in the quoted portion, so those are also easy to filter.
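
For anyone whose mail program can run a script as a filter, here is a rough sketch of both rules (filter by sender, and filter replies that quote him by name). It is only an illustration, not something I run myself: the address, the name, and the function name `should_filter` are made-up placeholders, and it assumes a plain single-part message.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical placeholders: substitute the address and name you want to filter.
BLOCKED_ADDRESSES = {"troll@example.com"}
BLOCKED_NAMES = {"trollname"}


def should_filter(raw_message: str) -> bool:
    """Return True if the message is from a blocked sender or quotes one by name."""
    msg = message_from_string(raw_message)

    # Rule 1: the message comes directly from a blocked address.
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in BLOCKED_ADDRESSES:
        return True

    # Rule 2: a blocked name appears in the quoted portion (lines starting with ">").
    body = msg.get_payload()
    if not isinstance(body, str):
        return False  # multipart messages are left alone in this sketch
    quoted = [line for line in body.splitlines() if line.lstrip().startswith(">")]
    quoted_text = "\n".join(quoted).lower()
    return any(name in quoted_text for name in BLOCKED_NAMES)
```

Only quoted lines are checked for the name, so a message that merely mentions him in passing still gets through; loosen or tighten that to taste.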

I do think that if "the usual suspects" filtered his posts and thus didn't reply, the noise level would go way down.

That's an improvement over the current situation, but still has a potential drawback.  Even if nearly everyone stopped taking the bait, bb could still undermine the list by continuing to post.  It would be into a vacuum for those of us who filter, but not for everyone -- e.g. not for new folks on the list, and not for misc. people who come across the list archives.  To someone who doesn't know the history, it looks downright rude that a whole bunch of posts have no replies.

The general (though not universal) sentiment seems to be that bb is an unwelcome guest.  Posting to the list is not a right, e.g. outright spam is certainly not allowed.  So, I think the community would be better off by a ban.  Anyone can go back thru the list archives to see why that step was taken (even if in the end some don't agree with it).

And, as noted, there's no harm in creating a brand new list for bb.  Anyone who values the discussion can go there.  Though IMHO PG shouldn't feel at all obligated to do so; there's no shortage of places where bb can host his own list.
-- 

Cheers,

Scott S. Lawton
http://Classicosm.com/ - classic books
From arnold.villeneuve at cirilab.com  Sun Oct 22 17:15:56 2006
From: arnold.villeneuve at cirilab.com (Arnold Villeneuve)
Date: Sun Oct 22 17:23:21 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
Message-ID: <006401c6f638$6985dda0$6501a8c0@TRIAGE1>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 11583 bytes
Desc: not available
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/5cfdc3c8/attachment-0001.jpe
From joshua at hutchinson.net  Sun Oct 22 17:37:21 2006
From: joshua at hutchinson.net (joshua@hutchinson.net)
Date: Sun Oct 22 17:37:34 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
Message-ID: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net>

Well, two things.

1 - I have no idea what a knowledge map is and why it would be 
useful.  Looking at the knowledge map for Mark Twain didn't explain 
what I was looking at (it seemed like a random collection of quotes 
with no discernable organization).  A quick Google search gave a bunch 
of sites on Knowledge Maps but it still means nothing to me (all the 
sites I saw talked about organizing information, but not HOW it was 
organizing it).

2 - It is complete gibberish in Firefox.  I only saw anything useful 
by using Internet Explorer.  Since I don't use IE as my normal browser, 
I would have normally ignored any link that brought me to your site.  I 
strongly suggest you fix that immediately.


Josh

----Original Message----
From: arnold.villeneuve@cirilab.com
Date: Oct 22, 2006 20:15
To: <gutvol-d@lists.pglaf.org>
Subj: [gutvol-d] Knowledge Maps of Gutenberg Collections

Hello All

Cirilab Inc is a new company within The Gutenberg Project area. We are just
beginning to see how our technology can leverage the vast Gutenberg
warehouse of public domain books and writings.

What do we do? Cirilab creates Knowledge Maps of a collection of books /
documents and Knowledge Views of individual books / documents.

What is our goal? Cirilab wants to create a Knowledge Map of the Top 100
Authors by download on Gutenberg as a first phase of our project.

What do we want? We really want to have input from Gutenberg Volunteers
regarding our Knowledge Maps. We would really like the Gutenberg Volunteers
to shape the development of Knowledge Maps of Authors works that are
available on the Gutenberg website.

Here are a few examples of some of the work we have done with The Gutenberg
Project so far. Please remember that these are just examples and that they
are in early development. We produced them so that you would have something
to evaluate.

http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm

http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doyle/index.htm

http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/index.htm

As per The Gutenberg Project, 20% of the profits generated from ads within
the Knowledge Maps will be donated to the cause, which is part of our goal.
We would be considered under the Partners, Affiliates, and Resources section
of Gutenberg's website. Eventually, we would like to get to a place where
Gutenberg volunteers are satisfied with our Knowledge Maps so that they can
be listed with each author we have done one for.

I look forward to hearing from you. We really want the most important
people at The Gutenberg Project, Volunteers, to be driving the direction of
this effort.

Arnold Villeneuve
Vice President
www.cirilab.com
http://knowledgeuser.typepad.com
613-833-0984

From donovan at abs.net  Sun Oct 22 17:41:09 2006
From: donovan at abs.net (D Garcia)
Date: Sun Oct 22 18:17:40 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
In-Reply-To: <006401c6f638$6985dda0$6501a8c0@TRIAGE1>
References: <006401c6f638$6985dda0$6501a8c0@TRIAGE1>
Message-ID: <200610222041.10067.donovan@abs.net>

On Sunday 22 October 2006 08:15 pm, Arnold Villeneuve wrote:
> Cirilab Inc is a new company within The Gutenberg Project area. We are just
> beginning to see how our technology can leverage the vast Gutenberg
> warehouse of public domain books and writings.

Hey, now that's refreshing! Normally, when one is attempting to promote a 
commercial venture towards a potential partner, *working* examples are 
given. :)

Now, for bonus points: Where is The Gutenberg Project Area in relation to Area 
51?
From scott_bulkmail at productarchitect.com  Sun Oct 22 18:02:03 2006
From: scott_bulkmail at productarchitect.com (Scott Lawton)
Date: Sun Oct 22 18:29:51 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
In-Reply-To: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net>
References: <32217577.1161563841846.JavaMail.?@fh1037.dia.cp.net>
Message-ID: <p06110431c161c4b6f78b@[192.168.0.52]>

>2 - It is complete gibberish in Firefox.

To Cirilab: you may also want to run the pages thru http://validator.w3.org/

-- 

Cheers,

Scott S. Lawton
http://Classicosm.com/ - classic books
From cannona at fireantproductions.com  Sun Oct 22 18:37:38 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Oct 22 18:37:59 2006
Subject: [gutvol-d] Fw: Gutenberg Republisher Update from Cirilab 
Message-ID: <002801c6f643$daa50f50$0300a8c0@blackbox>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This is a message that was sent to cd@pglaf.org from the same individual.  I
don't know if this sheds any more light on the subject or not.

Anyway, I checked wikipedia and there is no article on knowledge maps, so I
don't know either.


Sincerely
Aaron Cannon


- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail
address.)
- ----- Original Message -----
From: "Arnold Villeneuve" <arnold.villeneuve@cirilab.com>
To: <arnold.villeneuve@cirilab.com>; <help@pglaf.org>; <errata@pglaf.org>;
<catalog@pglaf.org>; <copyright@pglaf.org>; <cd@pglaf.org>;
<hart@pobox.com>; <gbnewby@pglaf.org>
Sent: Saturday, October 14, 2006 1:53 PM
Subject: RE: Gutenberg Republisher Update from Cirilab


> Hello Again
>
>
>
> I forgot to mention that Cirilab Inc would like to have Gutenberg
> participate in its Affiliate Program, so that when people purchase our
> software as a result of a link from the Gutenberg website or a Gutenberg
> link within one of the Knowledge Maps created from its Public Domain
> documents, we would pay Gutenberg 20% of the purchase price.
>
>
>
> Can someone please let me know how we would go about this.
>
>
>
> We would also like to include the Cirilab Gutenberg Library of Knowledge
> Maps on the Gutenberg Affiliate page.
>
>
>
> Arnold Villeneuve
>
> Vice President
>
> www.cirilab.com <http://www.cirilab.com/>
>
> http://knowledgeuser.typepad.com <http://knowledgeuser.typepad.com/>
>
> 613-833-0984
>
>  _____
>
> From: Arnold Villeneuve [mailto:arnold.villeneuve@cirilab.com]
> Sent: October 14, 2006 11:03 AM
> To: 'arnold.villeneuve@cirilab.com'; 'help@pglaf.org'; 'errata@pglaf.org';
> 'catalog@pglaf.org'; 'copyright@pglaf.org'; 'cd@pglaf.org';
> 'hart@pobox.com'; 'gbnewby@pglaf.org'
> Subject: Gutenberg Republisher Update from Cirilab
>
>
>
> Hello
>
>
>
> Here is an example of what we can do with Gutenberg Public Domain content.
> This example is of the 13 works of Winston Churchill.
>
>
>
> http://www.cirilab.com/TSMap/Cirilab_Library/Literature/winston_churchill/index.htm
>
>
>
> The Gutenberg logo and link is on every Knowledge Map page and
> additionally on every individual Knowledge View of a book. The entire
> document is the original Gutenberg document.
>
>
>
> We have also created a link on Wikipedia to the Knowledge Map.
>
>
>
>
>
>
>
>
>
> I have two questions for the Gutenberg people at this time:
>
>
>
> 1. Is the republishing of the Gutenberg public domain documents within
> the Knowledge Map acceptable to Gutenberg?
> 2. How can Cirilab create and publish a Knowledge Map right on the
> Gutenberg web site for the Top 100 authors to begin with? In other words,
> when someone looks at a specific author's collection of works on
> Gutenberg, we would like to have a Knowledge Map link of their work on
> that page so the reader can navigate the collection thematically.
>
>
>
> Please let us know what you think of the Winston Churchill Knowledge Map.
>
>
>
> Arnold Villeneuve
>
> Vice President
>
> www.cirilab.com <http://www.cirilab.com/>
>
> http://knowledgeuser.typepad.com <http://knowledgeuser.typepad.com/>
>
> 613-833-0984
>
>  _____
>
> From: Arnold Villeneuve [mailto:arnold.villeneuve@cirilab.com]
> Sent: October 7, 2006 9:55 PM
> To: 'help@pglaf.org'; 'errata@pglaf.org'; 'catalog@pglaf.org';
> 'copyright@pglaf.org'; 'cd@pglaf.org'; 'hart@pobox.com';
> 'gbnewby@pglaf.org'
> Subject: Gutenberg Republisher Request
>
>
>
> Hello
>
>
>
> I'm not really sure who I should be making this request to so I'm writing
> to all of you in the hopes that someone will be able to point me in the
> right direction.
>
>
>
> Cirilab provides Information Triage technology that allows people to
> review great volumes of data more quickly and more precisely. We are now
> creating a library of Public Domain content and want to enter discussions
> with The Gutenberg Project in order to properly access the Public Domain
> documents you provide while ensuring we adhere to your republication
> licensing requirements.
>
>
>
> We are particularly interested in establishing a protocol whereby we can
> create Cirilab Knowledge Maps of each author that The Gutenberg Project
> has available and do it in an automated way in order to make the process
> more efficient and maintainable as The Gutenberg Project updates its
> content.
>
>
>
> Can the appropriate person at The Gutenberg Project please contact me
> directly to begin these discussions.
>
>
>
> If we can make this work, Cirilab will ensure that a portion of any
> revenue we generate is donated to The Gutenberg Project as part of the
> partnership arrangement.
>
>
>
> We look forward to hearing from you soon.
>
>
>
> Arnold Villeneuve
>
> Vice President
>
> www.cirilab.com <http://www.cirilab.com/>
>
> http://knowledgeuser.typepad.com <http://knowledgeuser.typepad.com/>
>
> 613-833-0984
>
>
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959
Comment: Key available from all major key servers.

iD8DBQFFPBzwI7J99hVZuJcRAn66AJoCy778jNSPd4xIGeq74Ak3sALUSwCfZz/w
K9mwTxZnMfd1XEJ8pX5FcLY=
=Q4dr
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 58456 bytes
Desc: not available
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/bcba097e/image001-0001.jpg
From arnold.villeneuve at cirilab.com  Sun Oct 22 18:48:10 2006
From: arnold.villeneuve at cirilab.com (Arnold Villeneuve)
Date: Sun Oct 22 18:48:58 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
In-Reply-To: <c26320b80610221838o79aeee50v907f4184d2d30cf9@mail.gmail.com>
Message-ID: <00b101c6f645$4be51b00$6501a8c0@TRIAGE1>

Hello Onorio 

 
Thank you for taking the time to provide some feedback. I really appreciate
it. We are trying to improve what we do and your comments are important to
us. You are not the first person to request better Firefox or other web
browser support. 

 
Your comments will help me raise the issue within my own company so that I
can ensure that we do eventually provide better support for open source
browsers. 

 
Again, thank you sincerely for your $0.02 cents. It's worth a lot to us. 

 
Arnold Villeneuve

Vice President

www.cirilab.com <http://www.cirilab.com/>  

http://knowledgeuser.typepad.com <http://knowledgeuser.typepad.com/>  

613-833-0984

  _____  

From: catenacci@gmail.com [mailto:catenacci@gmail.com] On Behalf Of Onorio
Catenacci
Sent: October 22, 2006 9:39 PM
To: arnold.villeneuve@cirilab.com; Project Gutenberg Volunteer Discussion
Subject: Re: [gutvol-d] Knowledge Maps of Gutenberg Collections

 
On 10/22/06, Arnold Villeneuve <arnold.villeneuve@cirilab.com> wrote:

Hello All

 
Cirilab Inc is a new company within The Gutenberg Project area. We are just
beginning to see how our technology can leverage the vast Gutenberg
warehouse of public domain books and writings. 

 
What do we do? Cirilab creates Knowledge Maps of a collection of books /
documents and Knowledge Views of individual books / documents. 

 
What is our goal? Cirilab wants to create a Knowledge Map of the Top 100
Authors by download on Gutenberg as a first phase of our project. 

 
What do we want? We really want to have input from Gutenberg Volunteers
regarding our Knowledge Maps. We would really like the Gutenberg Volunteers
to shape the development of Knowledge Maps of Authors works that are
available on the Gutenberg website.  

 
Here are a few examples of some of the work we have done with The Gutenberg
Project so far. Please remember that these are just examples and that they
are in early development. We produced them so that you would have something
to evaluate.  

                            
http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm

 
http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doyle/index.htm

 
http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/index.htm

 
As per The Gutenberg Project, 20% of the profits generated from ads within
the Knowledge Maps will be donated to the cause, which is part of our goal.
We would be considered under the Partners, Affiliates, and Resources section
of Gutenberg's website. Eventually, we would like to get to a place where
Gutenberg volunteers are satisfied with our Knowledge Maps so that they can
be listed with each author we have done one for. 


I know I shouldn't but I automatically tend to think less of webpages that
are only viewable with IE.  Especially considering the sort of person who's
likely to volunteer to help with PG, this is a really glaring omission. 

It seems to me that a lot of PG's work is leveraged on open standards--which
basically seems to be the antithesis of "Best Viewed With IE" webpages. 


Just my $.02.

-- 
Onorio 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20061022/e24241bc/attachment.html
From Catenacci at Ieee.Org  Sun Oct 22 18:50:36 2006
From: Catenacci at Ieee.Org (Onorio Catenacci)
Date: Sun Oct 22 18:50:39 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
In-Reply-To: <00b101c6f645$4be51b00$6501a8c0@TRIAGE1>
References: <c26320b80610221838o79aeee50v907f4184d2d30cf9@mail.gmail.com>
	<00b101c6f645$4be51b00$6501a8c0@TRIAGE1>
Message-ID: <c26320b80610221850n4ad2b8d8jaba23f092149d165@mail.gmail.com>

On 10/22/06, Arnold Villeneuve <arnold.villeneuve@cirilab.com> wrote:
>
>
>
>
> Hello Onorio
>
>
>
> Thank you for taking the time to provide some feedback. I really appreciate
> it. We are trying to improve what we do and your comments are important to
> us. You are not the first person to request better Firefox or other web
> browser support.
>
>
>
> Your comments will help me raise the issue within my own company so that I
> can ensure that we do eventually provide better support for open source
> browsers.
>
>
>
> Again, thank you sincerely for your $0.02 cents. It's worth a lot to us.
>

Don't think of it as supporting open source browsers.  Think of it as
supporting web standards.

-- 
Onorio
From Catenacci at Ieee.Org  Sun Oct 22 18:38:32 2006
From: Catenacci at Ieee.Org (Onorio Catenacci)
Date: Mon Oct 23 00:56:38 2006
Subject: [gutvol-d] Knowledge Maps of Gutenberg Collections
In-Reply-To: <006401c6f638$6985dda0$6501a8c0@TRIAGE1>
References: <006401c6f638$6985dda0$6501a8c0@TRIAGE1>
Message-ID: <c26320b80610221838o79aeee50v907f4184d2d30cf9@mail.gmail.com>

On 10/22/06, Arnold Villeneuve <arnold.villeneuve@cirilab.com> wrote:
>
>  Hello All
>
>
>
> Cirilab Inc is a new company within The Gutenberg Project area. We are
> just beginning to see how our technology can leverage the vast Gutenberg
> warehouse of public domain books and writings.
>
>
>
> What do we do? Cirilab creates Knowledge Maps of a collection of books /
> documents and Knowledge Views of individual books / documents.
>
>
>
> What is our goal? Cirilab wants to create a Knowledge Map of the Top 100
> Authors by download on Gutenberg as a first phase of our project.
>
>
>
> What do we want? We really want to have input from Gutenberg Volunteers
> regarding our Knowledge Maps. We would really like the Gutenberg Volunteers
> to shape the development of Knowledge Maps of Authors works that are
> available on the Gutenberg website.
>
>
>
> Here are a few examples of some of the work we have done with The
> Gutenberg Project so far. Please remember that these are just examples and
> that they are in early development. We produced them so that you would have
> something to evaluate.
>
>
>
> http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Twain/index.htm
>
>
>
>
> http://www.cirilab.com/TSMap/Cirilab_Library/Literature/Sir_Arthur_Conan_Doyle/index.htm
>
>
>
>
> http://www.cirilab.com/TSMAP/Cirilab_Library/Literature/Winston_Churchill/index.htm
>
>
>
> As per The Gutenberg Project, 20% of the profits generated from ads within
> the Knowledge Maps will be donated to the cause, which is part of our goal.
> We would be considered under the Partners, Affiliates, and Resources section
> of Gutenberg's website. Eventually, we would like to get to a place where
> Gutenberg volunteers are satisfied with our Knowledge Maps so that they can
> be listed with each author we have done one for.
>


I know I shouldn't but I automatically tend to think less of webpages that
are only viewable with IE.  Especially considering the sort of person who's
likely to volunteer to help with PG, this is a really glaring omission.

It seems to me that a lot of PG's work is leveraged on open standards--which
basically seems to be the antithesis of "Best Viewed With IE" webpages.


Just my $.02.

-- 
Onorio
From schultzk at uni-trier.de  Mon Oct 23 01:02:30 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Mon Oct 23 01:02:36 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <c56.49eae21.326a5848@aol.com>
References: <c56.49eae21.326a5848@aol.com>
Message-ID: <F7CA19C3-3815-4D74-AD6F-6901BD60D7F5@uni-trier.de>

Hi There,

	
Am 20.10.2006 um 18:50 schrieb Bowerbird@aol.com:

> keith said:
> >   Due to its nature multi-line parsing or splitting is not quite that easy
>
> maybe.  but we'll make it work.         :+)
>
>
> >   split is nice. But you want to be doing parsing
>
> on a personal note, any time i call what i'm doing "parsing",
> it goes badly.  but as soon as i call it by _another_ name,
> the same thing with the same code, it starts working better.
> so i've grown allergic to that word, and i almost never use it.      :+)
>
	As you mentioned below, you want something quick and dirty
	that will get you 80% of the way, just like word-for-word translation.

> however, i assume that you're talking about "parsing" in the
> "let's parse the dom tree" sense.  (no, i don't even know what
> that means, so i might well have misused it, which would be
> poetic in its own way.)
>
> that kind of "parsing" would make our code very complicated.
>
> and in the same way that i don't like my format to be complex,
> i don't like my programs to be complex.  so i make them simple.
>
> and what i'm showing people here is how much mileage can be
> obtained out of the simple combination of a simple format and
> some simple programs.  that's the whole purpose of this exercise.
>
> so just stick with me for a little bit on "split", and see some tricks.
>
> (and, just to be clear, although you might think this is related to
> z.m.l., and thus can be swiftly relegated to the "i don't care" pile,
> the truth of the matter is that since virtually all of the books in
> the p.g. library have a plain-ascii representative, one that is close
> to z.m.l. format and perhaps even exact, the code that i'm showing
> here could also be used to great effect on the library as it stands.
> there are a lot of neat features that could be offered with very little
> work or trouble by using the simple code routines i'll reveal here.
> just as an example, how about a simple script that would give us
> a list of the section-headers for every book in the entire library?
> i don't know about you, but i think this "super table of contents"
> for the entire library would be very cool, and likely quite useful.
> and within a week or two of these little daily lessons, we'll have it.)
>
>
> >   Just for the fun of it your script is incomplete.
>
> good observation, keith.  now tell us why, and complete it...      ;+)

	I am too lazy to write the code, but you simply forgot to close a few
	HTML tags. No biggie. ;-))

	As to Marcello's suggestion about the file name: one could write a case
	to check for the filename, or make your code a procedure and write
	wrappers for case-by-case usage. Or... nah, too complicated.

		regards
			Keith.

P.S. Do I see it right that we are in Perl 101, where we try to top each
       other with the fastest and shortest code?


From schultzk at uni-trier.de  Mon Oct 23 01:14:48 2006
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Mon Oct 23 01:14:53 2006
Subject: [gutvol-d] here's the perl code for babelfish assignment 01
In-Reply-To: <4539088B.2070009@perathoner.de>
References: <380.f7cf57f.326a50ff@aol.com> <4539088B.2070009@perathoner.de>
Message-ID: <844C5014-F134-4787-9D67-B9ED5E980AE6@uni-trier.de>

Hi There,


Am 20.10.2006 um 19:34 schrieb Marcello Perathoner:

> Bowerbird@aol.com wrote:
>
>> other times, when the script is sitting on your website and to be called,
>> you'll want it to read the parameters as passed from the calling script.
>> so, for instance, if we were to pass the filename starting in column 12:
>>>    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
>>>    $thefilename=substr($buffer,12);
>
> You are just cutting and pasting out of some perl cgi tutorial. You
> don't have the least idea what is going on.
>
	Do you know what you are doing? It always depends on how the
	code is called: via GET, POST, or in the URL, or whether you have
	stuffed it in a cookie or even a browser variable.
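
To make the point concrete, here is a minimal sketch of how a CGI script might pick up a parameter depending on how it was called. It is written in Python rather than Perl and is not code from this thread. The function name `read_params` and the parameter name `thefilename` are assumptions made purely for illustration. Cookies are left out.

```python
import io
from urllib.parse import parse_qs

def read_params(environ, stdin):
    """Return CGI parameters as a dict, whatever the request method."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the form data arrives on standard input, CONTENT_LENGTH long.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the data is carried in the URL, after the question mark.
        raw = environ.get("QUERY_STRING", "")
    # parse_qs decodes "a=1&b=2" into {"a": ["1"], "b": ["2"]}.
    return {key: values[0] for key, values in parse_qs(raw).items()}

# Hypothetical requests carrying the same filename two different ways.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "thefilename=myant.zml"}
post_body = "thefilename=myant.zml"
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(post_body))}

print(read_params(get_env, io.StringIO("")))
print(read_params(post_env, io.StringIO(post_body)))
```

Looking the value up by name, instead of cutting the buffer at a fixed column as `substr($buffer,12)` does, keeps the script working if the field name or the order of the fields changes.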

	I guess we are back to Programming 101 and nitpicking.


	Keith. 
From lee at novomail.net  Mon Oct 23 09:45:58 2006
From: lee at novomail.net (Lee Passey)
Date: Mon Oct 23 09:44:26 2006
Subject: [gutvol-d] Proposal: creation of new mailing list for	PG-related
	format and process debate
In-Reply-To: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
References: <BAY105-W5A14A77A33C4CFF0CCC0ADF030@phx.gbl>
Message-ID: <453CF1C6.9040401@novomail.net>

Dave Doty wrote:

>> From: gbnewby@pglaf.org
> 
>> I encourage people to take control of their own mailboxes.  If you
>> don't like reading postings from someone, filter them.
> 
> The problem is the high number of people who seem to enjoy arguing
> with BB.  Even though I banned him years ago, it's still not uncommon
> that I open the mailbox and find it stuffed full of e-mail quoting
> him in full and following with extended arguing.  The problem isn't
> even BB himself, but that this has become the BB Forum, and that
> debating him seems to take up more space than everything else
> PG-related.  Other than banning half the list, most of whom have
> things worth saying in other contexts, how can I take control of my
> mailbox to deal with this?  Or is it a case of "put up with it or
> leave?"

By all objective measures, even at its busiest gutvol-d is a relatively 
low volume list. It is also (for me, at any rate) a relatively low 
priority list. Thus, what I have done is set a filter to automatically 
route /all/ traffic from gutvol-d into a gutvol-d folder. That way the 
posts do not disrupt my daily work-flow and I can choose the time to 
look at them; thus segregated, I can fairly quickly determine which 
messages deserve attention, and which can be consigned to the bit-bucket.

-- 
Nothing of significance below this line.

From sly at victoria.tc.ca  Mon Oct 23 09:55:06 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Mon Oct 23 09:55:12 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
Message-ID: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>


It might be of interest to some here to take a look at the paper
"Limits of self-organization: Peer production and laws of quality"
by Paul Duguid.

http://www.firstmonday.org/issues/issue11_10/duguid/index.html

It contains some criticism of Project Gutenberg, particularly
of PG#1079, Tristram Shandy.

Andrew
From prosfilaes at gmail.com  Mon Oct 23 11:42:50 2006
From: prosfilaes at gmail.com (David Starner)
Date: Mon Oct 23 11:42:55 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
In-Reply-To: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
References: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
Message-ID: <6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com>

On 10/23/06, Andrew Sly <sly@victoria.tc.ca> wrote:
>
> It might be of interest to some here to take a look at the paper
> "Limits of self-organization: Peer production and laws of quality"
> by Paul Duguid.
>
> http://www.firstmonday.org/issues/issue11_10/duguid/index.html
>
> It contains some criticism of Project Gutenberg, particularly
> of PG#1079, Tristram Shandy.

It frustrates me that people keep nitpicking our translations. Yes,
many translations aren't the greatest in the world. But translations
stand independent of the original; how is someone supposed to really
understand the note at the front of the Penguin edition without a copy
of the earlier translation to compare it to?

Oh, yeah, and apparently his computer can't read Latin-1 properly, and
he blames us.
From lee at novomail.net  Mon Oct 23 14:32:13 2006
From: lee at novomail.net (Lee Passey)
Date: Mon Oct 23 14:30:46 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
In-Reply-To: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
References: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
Message-ID: <453D34DD.2030703@novomail.net>

Andrew Sly wrote:
> It might be of interest to some here to take a look at the paper
> "Limits of self-organization: Peer production and laws of quality"
> by Paul Duguid.
> 
> http://www.firstmonday.org/issues/issue11_10/duguid/index.html
> 
> It contains some criticism of Project Gutenberg, particularly
> of PG#1079, Tristram Shandy.
> 
> Andrew

Thanks for the highly interesting link. I have a few quibbles with the 
analysis, but it was very enlightening nonetheless. The biggest problem 
with the PG analysis, in my mind, is that while he identified some real 
and serious concerns, there was no suggestion of systemic changes which 
could be made to resolve those concerns.

-- 
Nothing of significance below this line.

From ian at babcockbrown.com  Mon Oct 23 14:56:31 2006
From: ian at babcockbrown.com (Ian Stoba)
Date: Mon Oct 23 15:45:58 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
In-Reply-To: <6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com>
References: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
	<6d99d1fd0610231142x76356e68n7c232b913a774570@mail.gmail.com>
Message-ID: <6F378C93-F5A8-4AB7-98F0-BFD295178B3A@babcockbrown.com>


On Oct 23, 2006, at 11:42 AM, David Starner wrote:

> On 10/23/06, Andrew Sly <sly@victoria.tc.ca> wrote:
>>
>> It might be of interest to some here to take a look at the paper
>> "Limits of self-organization: Peer production and laws of quality"
>> by Paul Duguid.
>>
>> http://www.firstmonday.org/issues/issue11_10/duguid/index.html
>>
>> It contains some criticism of Project Gutenberg, particularly
>> of PG#1079, Tristram Shandy.
>
> It frustrates me that people keep nitpicking our translations. Yes,
> many translations aren't the greatest in the world. But translations
> stand independent of the original; how is someone supposed to really
> understand the note at the front of the Penguin edition without a copy
> of the earlier translation to compare it to?
>
> Oh, yeah, and apparently his computer can't read Latin-1 properly, and
> he blames us.

I thought the article was interesting and it raised two valid points,  
neither of which was really central to the paper's main question  
about the portability of the open source model to projects other than  
software development.

First: It is very difficult to create an accurate e-book for a  
printed book in which typography and design are integral to the  
author's creation. This problem is not unique to PG, by any means,  
and Duguid is correct to point out that editorial decisions are made  
in the process of creating an e-book. Again, these are both artifacts  
of the conversion from printed page to binary bits and are true for  
all e-book efforts, not just PG. The part which does have the most  
direct bearing on PG is the fact that some books are extremely  
difficult to present accurately in ASCII text, and Tristram Shandy  
certainly falls into this category. I still found myself wondering:  
is there some system of organization that could have done a better  
job rendering this complex work in ASCII? I think the shortcomings of  
this e-book are much more due to the inherent difficulty of rendering  
the text than they are to anything involving the structure of the PG  
volunteer group.

Second: PG ultimately aspires to being a repository of every public  
domain work on the planet. By definition that includes multiple  
editions of different works. The question of which edition gets  
digitized first depends on a number of factors. Duguid is correct in  
identifying that both newer editions (which may be encumbered with  
copyrights for introductions and other new content) and older  
editions (which may be too valuable or delicate to scan, or may  
simply be unavailable) may not be practical. This leaves a lot of  
Victorian era editions of many works available as source materials.  
In some cases, the editions available to PG to scan may have been  
Bowdlerized and may no longer reflect the author's original intent.  
The point is valid, but I don't see anything obvious that could be  
changed to make the situation better. The Million Book Project and  
Google both seem to face similar challenges in their efforts to scan  
public domain works.

So ultimately, like everything involving humans, there are things in  
PG e-books that are imperfect and Duguid has pointed out two of them.  
Unfortunately,  it does not seem to me that there are any practical  
structural or procedural changes that could be made that would  
address these issues. Perhaps high resolution page scans from a first  
edition are the best way to read an e-book of Tristram Shandy, but  
that is not a practical option for most readers. On balance, is the  
current (imperfect) version of the e-book better than not having a  
free e-book of Tristram Shandy at all? I think it is, but I would be  
interested to hear differing opinions.

--Ian


From sly at victoria.tc.ca  Mon Oct 23 22:49:45 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Mon Oct 23 22:49:49 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
In-Reply-To: <453D34DD.2030703@novomail.net>
References: <Pine.GSO.4.58.0610230948580.12835@vtn1.victoria.tc.ca>
	<453D34DD.2030703@novomail.net>
Message-ID: <Pine.GSO.4.58.0610232249100.18389@vtn1.victoria.tc.ca>


Another point is that, as someone very involved with PG, I know that
something with a PG number as low as 1079 is more likely to have certain
inconsistencies or problems than a more recent release might have.
However, to someone else, such as Paul Duguid, it is taken as being
representative of the whole collection.

To put it in perspective, this text is an example of part of a process
(which is still ongoing) of volunteers discovering what works over time.
(One issue in this case is putting footnotes inline, surrounded by brackets.)
If we were to imagine an alternate reality where PG was a top-down
organization, attempting to enforce sets of strict rules, it could
easily be the case that this would have been put aside and not posted
yet (nine years later). Is it possible that we could still be having arguments
about the "proper" way to represent a blank, black page?

Re: Ian's comment about challenges of representing typography and design
elements in digital transcriptions. Yes, as you say, this is a challenge
for any group, not just PG. Some people have tried to preserve information
relating to the digitization process. I've adapted dozens of texts for PG
from other online sources, and I am no longer surprised to find examples
where a text is very meticulously labelled with bibliographic data and so
forth, (which makes it appear very scholarly and acceptable); only to
examine it more closely and find out it is not actually from the source
which it claims, or that the preparer has put much effort into documenting
facts like smudged page numbers--while neglecting to fix many ocr scannos,
etc.


Andrew

From radicks at bellsouth.net  Tue Oct 24 08:02:58 2006
From: radicks at bellsouth.net (Dick Adicks)
Date: Tue Oct 24 08:03:04 2006
Subject: [gutvol-d] Paper which mentions Project Gutenberg
In-Reply-To: <Pine.GSO.4.58.0610232249100.18389@vtn1.victoria.tc.ca>
Message-ID: <C163A362.6BDF%radicks@bellsouth.net>

Andrew, it's worth noting that the critic adds the following qualification:

"I do not want the arguments above to suggest that Gracenote is worthless or
Project Gutenberg useless. Far from it. Both are immensely useful.
Nonetheless, both suffer from problems of quality that are not addressed by
what I have called the laws of quality -- the general faith that popular
sites that are open to improvement iron out problems and continuously
improve. In the case of Gracenote, it may be that only users with minority
tastes suffer and they should be prepared to look after themselves. In the
case of Project Gutenberg, by contrast, the Project does greatest disservice
to those it most seeks to serve, the general reader who may not know enough
about the texts he or she is reading to be able to distinguish nonsense from
complexity, editorial misjudgment from authorial teasing, bowdlerization
from Nordic prudery. In both cases, whether to guide users better or to
improve the system, these limitations need to be recognized."

He acknowledges the "immense usefulness" of PG, but he calls for a more
careful quality control.  Haste makes waste.  His criticism is worth heeding
for a volunteer effort that works _sub specie aeternitatis_.

Dick Adicks



From sly at victoria.tc.ca  Tue Oct 24 18:01:18 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Tue Oct 24 18:01:53 2006
Subject: [gutvol-d] Morse code
Message-ID: <Pine.GSO.4.58.0610241759090.13045@vtn1.victoria.tc.ca>


Ok, here's something to file under "unanticipated
uses of PG texts"...

"A Princess of Mars" converted to Morse Code
http://www.hotpeppersoftware.com/downloads/downloads.html


Andrew
From hyphen at hyphenologist.co.uk  Tue Oct 24 19:27:38 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Tue Oct 24 19:27:54 2006
Subject: [gutvol-d] Morse code
In-Reply-To: <Pine.GSO.4.58.0610241759090.13045@vtn1.victoria.tc.ca>
References: <Pine.GSO.4.58.0610241759090.13045@vtn1.victoria.tc.ca>
Message-ID: <rmitj2lp87m5anq5f6br71d736elog4jlp@4ax.com>

On Tue, 24 Oct 2006 18:01:18 -0700 (PDT),  Andrew Sly <sly@victoria.tc.ca>
wrote:

|
|Ok, here's something to file under "unanticipated
|uses of PG texts"...
|
|"A Princess of Mars" converted to Morse Code
|http://www.hotpeppersoftware.com/downloads/downloads.html

If it does not conform to the W3 standard surely we can not use it ;-)
-- 
Dave Fawthrop <dave hyphenologist co uk> 
"Intelligent Design?" my knees say *not*. 
"Intelligent Design?" my back says *not*.
More like "Incompetent design". Sig (C) Copyright Public Domain

From Bowerbird at aol.com  Fri Oct 27 12:01:09 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Oct 27 12:01:23 2006
Subject: [gutvol-d] the peace and quiet
Message-ID: <bda.6322da5.3273b175@aol.com>

gosh, the peace and quiet here has been so pleasant
that i've been totally reluctant to disturb it, even with
just one post a day, even for our open-source project.

so now i've stored up a credit backlog for several posts.

maybe we'll start up again next monday.   stay tuned...

-bowerbird
From hyphen at hyphenologist.co.uk  Fri Oct 27 14:11:11 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Fri Oct 27 14:11:24 2006
Subject: [gutvol-d] the peace and quiet
In-Reply-To: <bda.6322da5.3273b175@aol.com>
References: <bda.6322da5.3273b175@aol.com>
Message-ID: <hdt4k29fhbi5qnt4n18hat6m3alk0gd4dc@4ax.com>

On Fri, 27 Oct 2006 15:01:09 EDT,  Bowerbird@aol.com wrote:

|gosh, the peace and quiet here has been so pleasant

Well $?%$?%$&^%$ leave it that way ;-(
-- 
Dave Fawthrop <dave hyphenologist co uk> For Yorkshire Dialect 
http://www.gutenberg.org/author/John_Hartley
http://www.gutenberg.org/author/F_W_Moorman
19,000 free e-books at Project Gutenberg! http://www.gutenberg.org

From Bowerbird at aol.com  Sat Oct 28 20:49:01 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Oct 28 20:49:09 2006
Subject: [gutvol-d] the peace and quiet
Message-ID: <513.6734f940.32757ead@aol.com>

dave said:
>   Well $?%$?%$&^%$ leave it that way

please don't say "$?%$?%$&^%$" unless you really mean it...       ;+)

-bowerbird
From Bowerbird at aol.com  Mon Oct 30 13:57:50 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 30 13:58:00 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
Message-ID: <c14.79d1e63.3277cf5e@aol.com>

hi.   this is my one post for 2006-october-30th.

it's long, so please feel free to read it in chunks.
(there is an "executive summary" at the bottom.)

***

welcome back to "babelfish", our little
"open-source" project here on gutvol-d.

first, big props to jeroen hellingman for
joining in with the "open-source" spirit.

as he announced on the gutvol-p list,
jeroen has created roughly _200_ t.e.i.
e-texts that are now in the p.g. library,
and he's made his various tools available.

since his .tei isn't the same as the "official"
.pgtei, he's put only his .html and .pdf in
the p.g. library so far.   but maybe one day,
there will exist a conversion routine that
morphs jeroen's .tei into the "official" one.

in the meantime, check out his free tools!
 
***

and now to respond to some feedback from
the announcement i made about this project,
including some code that i released (which
is repeated below for your convenience)...

keith said:
>    As you mentioned below, you want something quick and dirty
>    that will get you 80% of the way, just like word-for-word translation.

this code is "proof-of-concept", so the goal is 100% functionality.

and since the main thing i am demonstrating is that it will be
_simple_ to write such code, i must also aim at "code simplicity".
(but since i'm just a beginner with perl, that'll come naturally...)

as long as i get reasonably close on those two -- proof-of-concept
and simplicity of the code -- i'll happily settle for as low as 10% on 
other variables, like speed, size, elegance, code beauty, and so on...

once a piece of code does what i want it to do, i'll move on.

as "open-source" code, i have released it early, and expect that
if other people like its functionality, they'll set out to improve it.

i'm just a beginner in perl, so it's unlikely i would be able
to smooth the code to professional level anyway, but nonetheless,
given my clear objectives here, there's no reason for me to do it...

my one and only mission now is to demonstrate the viability of
"zen markup language" toward creating a high-powered library
of plain-text files simple enough for a 4th-grader to maintain...

i don't even care if project gutenberg implements these features,
because i will be including them in my mirror of the p.g. library...

i only wish to show here how easy it is to put 'em into play, since they
can be realized with just a few lines of code written by a beginner...


>   As to Marcello's suggestion about the file name:
>    one could write a case to check for the filename,
>    or make your code a procedure and write wrappers
>    for case-by-case usage. Or... nah, too complicated.

listening to marcello right now will only make you confused.
stick with me for right now.   i'll tell you everything you need.


>   I am too lazy to write the code, but you simply
>    forgot to close a few HTML tags.  No biggie. ;-)

some of that is intentional -- the "body" and "html" close tags --
because i might want to have my script append something else
to that web-page in some of my experimenting down the line...

any other ones -- like "pre" -- are just because i didn't care;
the browser closes all the tags when it hits page-end anyway.


>    P.S. Do I see it right that we are in Perl 101, where we try to
>    top each other with the fastest and shortest code?

again, that's not my game here, i'm doing proof-of-concept,
but if you wanna play that, go ahead, it can be lots of fun...        :+)

i _am_ looking for _simple_ code, however.   so if you show me
an easier way of accomplishing some task, i might well adopt it.
(except in cases where i'm gonna leverage my way down the line.)

but as you will see, most of my routines are in the neighborhood
of just a couple of lines anyway, so i don't think it gets more simple...

***

keith said:
>    Do you know what you are doing?

i know my code works for me.   that's all i need to know right now.


>    It always depends on how the code is called: via GET, POST, or
>    in the URL, or whether you have stuffed it in a cookie or even a browser variable.

if my code doesn't work for you, or you can see cases where it
won't work under certain circumstances, do please let me know.

but what i'm doing is simple enough that i doubt that will happen.

(in a backchannel after having written this, i learned that keith was
directing those remarks at marcello, not at me.   but as i told him,
i like to ask myself the "do you know what you're doing?" question
on a regular basis.   it's good for grounding and self-improvement.)

***

when i came on this listserve 3 years ago,
it was to tell people that heavy markup of
the project gutenberg library is unnecessary.

it is still unnecessary.

for some people -- people who have made
a significant investment of time and energy,
(and life!), in trying to learn heavy markup --
this is _not_ a welcome message, and they
would like to not have to hear it, sometimes
to the point of keeping _you_ from hearing it.

so they have tried to shout me down.
and they have tried to get me banned.
and they have tried to discredit me.

at any time along the way, i could have
provided enough irrefutable truth that
they would have simply had to back off.

but what fun would that have been?         :+)

so i decided to toy with them instead.

much like a cat plays with a mouse.
a mouse who thinks he's an elephant.

on occasion, i would let them think they 
might have me "cornered", or "vulnerable".

i wanted to see how brazen they would be.

and boy could they be brazen!          :+)

at any rate, the time has come for proof.
and the proof is in the pudding.   eat it...

play-time is over.   we're getting real now.

i expect that the name-calling will escalate,
for a short time anyway.   not long after that,
they'll realize that their cause is hopeless, and
give up.   but _until_ then, they will do whatever
they think they can get away with, to make you
stop listening to me, so you won't see my proof.

but it doesn't matter whether you see it or not,
because this truth has a strength that will win...

when michael hart insisted that project gutenberg
was about _the_words_, and not fancy formatting,
he was exercising a very insightful vision, because
years later -- today -- we can make the formatting
automatic, if we have the words in their right order.

i will now prove that michael hart was right...

***

as i said earlier, my point in organizing our
open-source project is to demonstrate that 
a few simple programming techniques can
give us remarkable power when used on a
simple consistent file-format (like z.m.l.)...

we can certainly create the .html that serves
as a rosetta stone to various e-book formats.
hence the name of this project:   "babelfish"...

to orient you, i created this "top 10 list" of
these simple programming techniques...

1.   read files, from disk or websites.
2.   write files, to disk or webpage.

3.   split and join strings.
4.   search strings for substrings.
5.   get portions of strings.
6.   do replacements in strings.

7.   conditionals and loops (if/then, for/next, while/wend).

8.   i forget what 8 was for.

9.   collect and pass on user input.
10.   zip/unzip online files.

that's it.

heck, i don't even know how to do #10 myself yet
-- but i assume that it is easy using perl -- and
i might occasionally throw in another technique.
but for the most part, it'll just be these "top 10"...

so if you can do these 10 simple things in _your_
choice of a programming language, then you too
will be able to use the pseudo-code that i give you.
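
as a rough illustration of techniques 3 through 6, here is a minimal sketch in python (not perl, and not the code from this project). the sample text, the blank-line paragraph rule, and the underscore-to-italics replacement are assumptions made up for the example; they are not a statement of the actual z.m.l. rules.

```python
# a made-up plain-text sample; paragraphs are separated by one blank line.
text = "chapter one\n\nit was a _dark_ night.\n\nthe end"

# 3. split and join strings
paragraphs = text.split("\n\n")
rejoined = "\n\n".join(paragraphs)

# 4. search strings for substrings
position = text.find("dark")

# 5. get portions of strings
word = text[position:position + 4]

# 6. do replacements in strings
html = "".join("<p>" + p + "</p>\n" for p in paragraphs)
html = html.replace("_dark_", "<i>dark</i>")

print(len(paragraphs), position, word)
print(html)
```

each step is a single built-in call, so the whole conversion stays at a handful of lines.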

i know that most of you are _not_ programmers, at all.
but do please keep reading, because what's important
here is _not_ the programming per se, but the features
-- the functionality -- so your eyeballs will work fine...

besides, these features are targeted directly at _users_,
so each of you can judge their efficacy and desirability
as well as anyone else.   (i assume you are all readers.)

i promise i won't dwell on the code, i'll just list it out,
so people who wanna run it for themselves can do so.

but from my standpoint, the output is what's important.
so for each fragment of code, i'll give you a web u.r.l. to
load in your browser so you see the results _right_away_.

you don't have to look at the code at all, if you don't want.
just load the u.r.l. underneath it, and look at its _output_...

the lesson of my mission:   a dirt-simple format and
dirt-simple code can yield tremendous functionality.

***

i'm not going to talk very much about the z.m.l. format
in the development of this open-source coding project,
other than to describe what we will need to know about it
to write our routines.   of course, once you observe how the
routines work on a file, you'll obtain a good understanding
of why the "rules of z.m.l." were made exactly as they were.

here is the main .zml file that we'll use for this project:
>    http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml

this is "my antonia", by willa cather.   many people have
digitized this book.   (thanks to jon noring for the scans.)

it would be good if you took a close look at this .zml file.
convince yourself that there are no "tricks" in it, that it is a
plain-and-simple raw-ascii file -- carefully done, certainly,
but nothing more than that.   this .zml file is our _input_...

we'll have various types of _output_ along the way, but if you
want to know the main goal we are shooting for, you should
take a close look at the sequence of files you can hook into here:
>    http://snowy.arsc.alaska.edu/bowerbird/myant/myantp001.html

this particular set of files is a demo for "continuous proofreading"
-- a system where the public does the "final" proofing on a book --
so it puts up the text for each page opposite the scan for that page,
along with a form at the bottom that lets people report corrections.

of course, as you will see, we could also develop _other_ interfaces.
indeed, one of the things you will take away from this demonstration
is that it can be extremely easy to set things up exactly as you like them...
after all, that's one of the promises of open-source, isn't it?

***

for contrast, if you'd like to see e-book versions from jon noring:
>    http://www.openreader.org/myantonia/

or here's an _excellent_ .pdf, a "digital reprint" from jose menendez:
>    http://www.ibiblio.org/ebooks/Cather/index.html
jose has replicated the look of the original book, and has links that
enable the reader to summon the scans for comparison.   awesome!

***

so here we go...

***

to recap, our assignment #1 was to (a) read an e-text in,
then (b) write it out to a webpage.   here's the perl code:

>    #!/usr/bin/perl
>    ####################### read the file in, and write it out
>    ####################### send the http header first...
>    print "content-type: text/html\n\n";
>    ####################### ...then read the file in...
>    $filename="/home2/yoursiteinfohere/public_html/myant/myant.zml";
>    open (inf,"$filename") or print "that file was not available...<p>\n";
>    read (inf,$thebook,2222222); close inf;
>    ####################### ...and write it out...
>    print '<!doctype html public "-//w3c//dtd html 4.01 transitional//en">';
>    print '<html lang="en">';
>    print '<head>';
>    print '<meta http-equiv="content-type" content="text/html; charset=us-ascii">';
>    print '<title>babelfish!';
>    print '</title>';
>    print '</head>';
>    print '<body>';
>    print '<pre>';
>    print '<hr><br>';
>    ####################### ...and here's the money-shot.
>    print $thebook;
>    #   pseudo-code:   read the file in, and write it back out again

you can see the results of this code by running this script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish01.pl

this e-text is over 400k, so it takes a little while to load, especially
if you're on dialup.   so we'll try and do something about that later...

technically, we've translated the e-text into .html.   so we're done.
(just kidding...)           ;+)
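
for readers who'd rather follow along in python, here is a rough sketch of the same read-it-in, write-it-out step. it's an illustration under assumptions, not the script behind the u.r.l. above: the names `read_book` and `wrap_as_html` are invented, the page skeleton is trimmed down, and the text is escaped before it goes inside the `<pre>` (the perl prints it raw).

```python
import html


def read_book(path):
    """Read the whole e-text into one string.

    latin-1 maps every byte to a character, so decoding never fails;
    newline="" keeps the file's own line-ending characters untouched.
    """
    with open(path, "r", encoding="latin-1", newline="") as handle:
        return handle.read()


def wrap_as_html(text, title="babelfish!"):
    """Wrap the raw text in a minimal page, escaping html-special characters."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="us-ascii">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body><pre>\n{html.escape(text)}</pre></body></html>\n"
    )


# hypothetical usage:
#     print(wrap_as_html(read_book("myant.zml")))
```

opening the file with `newline=""` is a deliberate choice in this sketch: python then hands back whatever line-ending characters the file really contains, rather than quietly translating them.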

***

assignment #2 is to _split_ the e-text by its pages.

if you examine the file, you'll see that pagebreaks are
indicated by double-curly-brackets.   the name of the
scan of that page is enclosed in those curly-brackets.

right above that line, the last line of each page is its 
pagenumber, enclosed in double-standard-brackets.

in step #1 above, we read the book into a string.

now we'll "chunk" that string into its respective pages,
just by doing a "split" on a pair of open-curly-brackets.
(the "split" command splits the big string into a bunch 
of little ones, by using the "separator" as a split point.)

here's the code, which you can append to the code above
after discarding the last line of code (i.e., "print $thebook;"):

>    ####################### chunk it into pages and output 3
>    @onepage=split(/\{\{/,$thebook); foreach $onepage (@onepage) {
>    $nn++; if ($nn == 136 or $nn == 253 or $nn == 364)
>    {print $onepage; print '<hr><br>'}};
>    #   pseudo-code:   chunk into pages; output 136, 253, and 364.

you can see the results of this new code by running the script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish02.pl

this truly minuscule amount of code chunks the file into pages
(splitting on double-open-curly-brackets that indicate pages),
then runs through each page, and prints three selected ones...
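
the same chunking, sketched in python as an illustration only. `split_pages` and `pick_chunks` are made-up names, and the sample string is invented stand-in data, not a piece of myant.zml. positions count from 1, to match the perl counter.

```python
def split_pages(book):
    """Split on the '{{' that opens each pagebreak marker."""
    return book.split("{{")


def pick_chunks(book, wanted):
    """Return the chunks at the given positions, counting from 1 like $nn."""
    chunks = split_pages(book)
    return [chunks[n - 1] for n in wanted if 1 <= n <= len(chunks)]


# invented stand-in data: two short "pages" after some cover text
sample = (
    "cover text\n"
    "{{scan001.png}}\npage one\n[[1]]\n"
    "{{scan002.png}}\npage two\n[[2]]\n"
)

for chunk in pick_chunks(sample, [2, 3]):
    print(chunk)
    print("-" * 20)
```

note that whatever comes before the first marker counts as chunk 1, which is one reason chunk numbers and printed page numbers can drift apart.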

(what are those pages, and why did _those_ pages come out?
let's see you answer those questions, and comment on them.)

this ability to use "split" to separate the entire file into "chunks"
-- while simple to understand and program -- will provide us
a _lot_ of power for handling the text, when we use it wisely...

in particular, this ability to split the file into its respective pages
-- with each chunk being the text that appeared on one page --
is important because it's the very first step in creating the set of
.html pages that i pointed to up above which serves as our "goal":
>    http://snowy.arsc.alaska.edu/bowerbird/myant/myantp001.html

***

so, did you answer the question about what pages we got, and why?

we got pages 111, 222, and 333.   those are the pages i _wanted_.

but the code actually asked for _chunks_ that were 136, 253, and 364.
so how come we got pages 111, 222, and 333?

well, it's because this book has several "unnumbered" pages in it.

these pages include 2 "cover" pages (the cover and an added
"hot" table of contents) plus 13 other pages of front-matter,
and some _illustration_ pages sprinkled throughout the book.
(plus those illustrations are repeated at the end of the book.)

the "unnumbered" pages have chunks of text associated with them
(even if only the name of the graphic-file holding that illustration),
so we actually have more "chunks" of text than _numbered_pages_.
which means the two numbering systems are not in sync.

i had to go through some trial-and-error to discover the "chunk"
numbers that i needed, in order to get pages 111, 222, and 333,
as the chunks that i needed to request were 136, 253, and 364...

but of course, we don't want to have to do such "trial-and-error"
every time we want to display a specific page-number of text, so
what we will do is _search_ the text of each page/chunk for the
_pagenumber_ that we want.   you will remember the pagenumber
is enclosed in double-brackets as the last line of each page's text,
so it's rather easy to do a search for it.   when we find the chunks that
contain "[[111]]" and "[[222]]" and "[[333]]", we will spit _those_ out.

so here we go with assignment #3:   output pages 111, 222, and 333.

in place of the 3 lines that we had in the routine for babelfish02.pl,
we'll use this code for this new assignment, babelfish03.pl.

>    ####################### output pages 111, 222, and 333
>    @thepage=split(/\{\{/,$thebook); foreach $thepage (@thepage) {
>    $r1=index($thepage,"[[111]]"); $r2=index($thepage,"[[222]]");
>    $r3=index($thepage,"[[333]]"); if ($r1 > -1 or $r2 > -1 or $r3 > -1)
>    {print $thepage; print '<hr><br>'}};
>    #   pseudo-code:   output pages with "[[111]]", "[[222]]", or "[[333]]"

you can see the results of this code by running the next script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish03.pl

of course, the output here looks just like it did for babelfish02.pl;
but we've got a more robust way of creating it now, which is good.
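
here is roughly how that lookup-by-pagenumber might go in python. this is a sketch under assumptions: `pages_numbered` is an invented name and the sample is stand-in data. the point it illustrates is that searching for the whole "[[n]]" marker finds the right chunk no matter how many unnumbered chunks come before it.

```python
def pages_numbered(book, numbers):
    """Return the chunks that carry a '[[n]]' marker for any wanted n."""
    markers = [f"[[{n}]]" for n in numbers]
    return [
        chunk
        for chunk in book.split("{{")
        if any(marker in chunk for marker in markers)
    ]


# invented stand-in data; the first page is an unnumbered one
sample = (
    "{{cover.png}}\nan unnumbered page\n"
    "{{scan001.png}}\npage one\n[[1]]\n"
    "{{scan002.png}}\npage two\n[[2]]\n"
)

for page in pages_numbered(sample, [2]):
    print(page)
```

because the brackets are part of the search string, asking for page 1 can't accidentally match page 11 or page 111.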

***

so, to display the page we wanted, we learned how to search the text.

this naturally introduces us to the idea of searching for _words_,
and displaying the pages that contain the terms we'd specified...

so the next assignment is to write a _word-search_ routine;
let's say the user had specified a search for the term "breeze".

assignment #4:   output all of the pages with the word "breeze".

it's just a simple cut-back of the 5-line routine we just wrote...

>    ####################### output pages containing "breeze"
>    @thepage=split(/\{\{/,$thebook); foreach $thepage (@thepage)
>    {$result=index($thepage,"breeze");
>    if ($result > -1) {print $thepage; print '<hr><br>'}};
>    #   pseudo-code:   output all pages that contain the term "breeze".

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish04.pl

wow, that _was_ a breeze, wasn't it?   a lot of power in those 3 lines.

so you're beginning to get the picture.   with two dozen lines of code,
copied out of a _primer_ on perl, we've managed to cobble together
the raw engine to do an electronic search (one of the most powerful
of all the benefits offered by a switch from paper-books to e-books),
and to display one page, so we don't have to load in the whole book.
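
a python sketch of the word-search, for illustration. `pages_containing` is an invented name and the sample is stand-in data. the `ignore_case` switch is an addition of this sketch; the perl `index` call above is case-sensitive, so it would miss "Breeze" at the start of a sentence.

```python
def pages_containing(book, term, ignore_case=False):
    """Return every page-chunk whose text contains the search term."""
    pages = book.split("{{")
    if ignore_case:
        term = term.lower()
        return [page for page in pages if term in page.lower()]
    return [page for page in pages if term in page]


# invented stand-in data
sample = (
    "{{scan001.png}}\nA breeze came up.\n[[1]]\n"
    "{{scan002.png}}\nnothing stirred.\n[[2]]\n"
    "{{scan003.png}}\nBreezes all day.\n[[3]]\n"
)

for page in pages_containing(sample, "breeze", ignore_case=True):
    print(page)
```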

***

this ability to "split" the string can operate on a very granular level.
we can split the string on _whitespace_ -- spaces, newlines, tabs --
such that every _word_ is treated distinctly.   this means that we can
uniquely identify, by number, each and every word in the entire file.

so our next assignment is to do something along those very lines...

assignment #5:   number each word and output #75319-#75357.

>    ####################### chunk words, output 75319-75357
>    print "<small><small>";
>    @oneword=split(" ",$thebook); foreach $oneword (@oneword) {$nn++; 
>    if ($nn >= "75319" and $nn <= "75357") {print "$nn -- $oneword\n"}};
>    #   pseudo-code:   chunk into words; output 75319-75357.

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish05.pl

as you can see, this split and run through the words happens fast,
especially when we're outputting a mere 39 words.   if we output
all of the words, it's kinda slow, so we'd need to speed it up if we
wanted to put this in front of the public.   but gosh, what power!

not that people want to read a book with just one word per line,
but our ability to be _specific_ in pinning down a certain word
-- i.e., the 75,319th word in this file is "sunflower" -- could be
quite useful if we ever need to do any integrity checks on the file.

the ability to point to this particular sequence of words:
>    75319 -- sunflower
>    75320 -- stalk
>    75321 -- and
>    75322 -- clump
>    75323 -- of
>    75324 -- snow-on-the-mountain,
>    75325 -- drew
>    75326 -- itself
>    75327 -- up
>    75328 -- high
>    75329 -- and
>    75330 -- pointed;
>    75331 -- the
>    75332 -- very
>    75333 -- clods
>    75334 -- and
>    75335 -- fur-
>    75336 -- rows
>    75337 -- in
>    75338 -- the
>    75339 -- fields
>    75340 -- seemed
>    75341 -- to
>    75342 -- stand
>    75343 -- up
>    75344 -- sharply.
>    75345 -- I
>    75346 -- felt
>    75347 -- the
>    75348 -- old
>    75349 -- pull
>    75350 -- of
>    75351 -- the
>    75352 -- earth,
>    75353 -- the
>    75354 -- solemn
>    75355 -- magic
>    75356 -- that
>    75357 -- comes
with such a large degree of specificity is quite fantastic, and might
come in quite handy when we start building our linkage capabilities.

we might have dissent about how "word" is defined -- what with
>    75324 -- snow-on-the-mountain,
or
>    75335 -- fur-
>    75336 -- rows
-- but given that anyone in the world will get this same sequence
when running this same perl program on this same file, i'd think
we can accept this output as is and still feel quite comfortable...
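
a sketch of the word-numbering in python, as an illustration only (`numbered_words` is an invented name; the sample is a few words borrowed from the listing above). calling `split()` with no argument breaks the text on any run of whitespace, which is the same definition of "word" the perl uses.

```python
def numbered_words(book, first, last):
    """Return (position, word) pairs for positions first..last, counting from 1."""
    words = book.split()  # no argument: split on any run of whitespace
    start = max(first, 1)
    stop = min(last, len(words))
    return [(n, words[n - 1]) for n in range(start, stop + 1)]


sample = "the very clods and fur-\nrows in the fields"

for n, word in numbered_words(sample, 4, 6):
    print(f"{n} -- {word}")
```

as in the perl, a word hyphenated across a line-break counts as two words here ("fur-" and "rows").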

***

a split that is even more useful is the one we can do on _lines_.

so our next assignment, just a quick rewrite, is to do exactly that...

assignment #6:   number each _line_ and output _all_of_them_

>    ####################### number the lines, and output all
>    @oneline=split("\n",$thebook); foreach $oneline (@oneline) {
>    $nn++; if ($nn >= 0 and $nn <= 99999) {print "$nn -- $oneline\n"}};
>    #   pseudo-code:   chunk into lines; number and output all of them

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish06.pl

this is an excellent example of the power of a very small amount of code,
and it's extremely likely that we'll return to this capability down the line.

indeed, i can guarantee we'll be using this routine more in the future,
so we'll leave any further exposition of its magic for later...
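
the line-numbering, sketched in python for illustration. `numbered_lines` is an invented name, and the sketch assumes the text uses line-feeds between lines.

```python
def numbered_lines(book):
    """Return each line prefixed with its position, counting from 1."""
    lines = book.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece that follows a final newline
    return [f"{n} -- {line}" for n, line in enumerate(lines, start=1)]


for line in numbered_lines("first line\n\nthird line\n"):
    print(line)
```

blank lines keep their own numbers, so a line number always matches the line's position in the file.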

***

so far most of our "splits" have been on the p-book _pages_,
but we can split on other stuff if we _want_, and we just might. 

one of the rules of .zml is that a new section is indicated
by the presence of 4 or more blank lines before its header.

so we might want to search for _headers_ by searching for
(at least) 4 blank lines (i.e., 5 or more newline characters)...

and yes, we _could_ just proceed as we did above for "breeze",
and first split the book up into pages, and then search the text
of each page for 5 consecutive newlines.   sure, that would work.

but we can also split the book -- in the first place -- by using
5 consecutive newlines as the splitter.   what _that_ would do
is split the book up into its _sections_, rather than its pages...
(of course, in most p.g. e-texts, the "sections" are "chapters",
so you will all understand if i use those terms interchangeably.)

and since that will be a useful thing later, let's learn it now.

you might remember i sought help on chunking a string
by doing a split on _multiple-consecutive-line-endings_.

i said:
>    and if someone would tell me how to do a "split" on
>    a sequence of multiple line-endings, that'd be great.
>    i assumed it would be something like this:
>    >    @thesections=split('\n\n\n\n\n',$thebook);
>    but that doesn't appear to be working for me.

marcello came in with an answer.   i guess he couldn't
bring himself to tell you that the command i gave there
actually _is_ a correct specification for doing such a split.
he had to make a little mod that made it _seem_ different.
but the code i wrote there will indeed do the job just fine...

the _reason_why_ "it doesn't appear to be working" is that
that code will split on the _line-feed_ character that linux
(and thus webservers) define as the "newline" character...

but in actuality, the _input-file_ -- the "myant.zml" file --
had _carriage-returns_ as its linebreak characters, since
i made it on a mac, and the mac uses a _carriage-return_
as its indicator of a "newline".

welcome to the world of cross-platform incompatibilities.

you might know windows has yet another convention;
it uses a combination of a carriage-return-and-line-feed
as _its_ "newline" indicator.

because one incompatibility is never enough, is it?

now, since your web-browser might well do you the favor
of translating the carriage-returns in that myant.zml file
into the appropriate newline character on _your_ machine
when it displays that file to you, you might not have realized
that that file itself has the "wrong" newline characters in it...

but our little perl program takes each character _literally_...

so when my "split" command -- and marcello's as well --
went looking for 5 consecutive line-feeds, it found _zero_
occurrences, so it didn't actually do the split we wanted...

if you go looking for consecutive-line-feeds in a file that uses
carriage-returns for its newlines, you won't find any line-feeds,
not a one, let alone 5 consecutive ones.

that is, the command didn't give us the _output_ we wanted,
because the _input_ file wasn't exactly like we thought it was.

thus, when i asked "what's wrong with this code?", i was asking
the wrong question.   there was _nothing_ wrong with the code;
the problem was the _assumption_ about the file's line-endings.

marcello's workaround for this problem was a good one:
do a global-replace on the file that changes _non-desired_
line-ending characters into the line-endings that we want.

he did that with this line:
>   $text =~ s/\r//g;   # fix brain-dead M$-DOS and Mac line endings

maybe he didn't realize the line-ending problem right away,
since his "brain-dead" comment _might_ indicate frustration...

after all, no one newline is "right" or "wrong", any more than
driving on the "right" side of the road is the "right" way to do it.
over in england, they drive on the left side of the road; that isn't
"brain-dead", it's just a _different_ way of doing things, that's all.

but this line-ending confusion _is_ a constant hassle, i'll tell you.
(and since the mac is in the minority, i have to face it all the time.
so i hope you'll excuse me for reversing the problem on all of you.)


so, in general, if you'll be dealing with files that you haven't created,
and thus don't know what line-endings they use, doing a conversion
-- right after you've read any file in -- is the _best_ way to proceed...
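
one way such a conversion might look, sketched in python (`normalize_newlines` is an invented name). this sketch _converts_ carriage-returns instead of deleting them: in a file that uses bare carriage-returns, deleting them outright would remove every linebreak and glue the lines together.

```python
def normalize_newlines(text):
    """Turn windows (CR+LF) and old-mac (CR) line-endings into plain LF."""
    # order matters: collapse CR+LF pairs first, so that a pair
    # doesn't turn into two line-feeds in the second step
    return text.replace("\r\n", "\n").replace("\r", "\n")


mac_style = "chapter i\r\r\r\r\rchapter ii"
print(normalize_newlines(mac_style).count("\n\n\n\n\n"))
```

run on a file that already uses line-feeds, the conversion changes nothing, so it is safe to apply to every file as it is read in.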

you _can_ also write your code so it'll be oblivious to line-endings,
but sometimes that can get unnecessarily complex and unwieldy...

another option -- _if_ you're working with files that you create
and maintain -- is to standardize the line-ending used in the files.
(nonetheless, as a matter of course, the conversion doesn't hurt;
at worst, it will do nothing, but you will have the peace of mind.)

still another option is to maintain separate versions of each file,
one for each line-ending, so you can call in the appropriate one.

while i don't typically recommend it (why maintain separate files?)
this last option is the one that i'll use for the rest of this exercise.
my code will load a file named "myant-lf.zml" (_not_ "myant.zml").
(indeed, i snuck it in on the previous exercise, for babelfish06.pl.)

however, when i _talk_ about the file, _i_ will still call it "myant.zml".
i'm not doing this _deliberately_ to confuse you (ok, maybe a little),
the purpose is to serve as a constant reminder of this little irritation
so you don't let yourself -- or your routines -- get tripped up by it.

it's also a general warning that program routines are quite literal;
if you _tell_ them to look for a line-feed, they'll look for a line-feed,
even if what you _really_ wanted was for them to look for a newline
(whether that be a carriage-return or a line-feed or a combination).

the lesson is that it's very important for you to be very precise when
telling routines what you want them to do.   they'll do what you _say_,
provided you say it correctly, and not necessarily what you _mean_...

now hey, maybe you non-programmers are thinking to yourselves,
"didn't he tell us that he wouldn't bog us down in program-speak?"

yes, i did, and i'm sorry if this has strayed a little too close to the line,
but the lesson applies to you guys too, not just to us programmers...

when you tell us the features that you want, you need to be specific!

if you can specify -- in terms of the actual reality of the file as is --
how to obtain a feature that you want, that will almost _guarantee_
that you will get that feature.

***

ok, so our next assignment is to split the text into sections.

so here we go with assignment #7:   output the 21st section.

>    ####################### output the 21st section
>    @onesection=split("\n\n\n\n\n",$thebook); foreach $onesection (@onesection) {
>    $nn++; if ($nn == 21) {print $onesection}};
>    #   pseudo-code:   split into sections and output the 21st

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish07.pl

as you can imagine, later on we'll have a need to output sections,
so this is an extremely important piece of code.   in just a few lines.
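
a python sketch of the section split, for illustration. the names `split_sections` and `nth_section` are invented, and the sketch assumes the text already uses line-feeds. one deliberate difference from the perl: the pattern here swallows _5 or more_ newlines, so a header preceded by extra blank lines doesn't start with leftover newlines.

```python
import re


def split_sections(book):
    """Split wherever 5 or more newlines occur in a row (4+ blank lines)."""
    return re.split(r"\n{5,}", book)


def nth_section(book, n):
    """Return section n, counting from 1, or None if there is no such section."""
    sections = split_sections(book)
    return sections[n - 1] if 1 <= n <= len(sections) else None


# invented stand-in data: 4 blank lines, then 6 blank lines
sample = "front\n\n\n\n\nchapter i\ntext\n\n\n\n\n\n\nchapter ii\nmore"

print(nth_section(sample, 2))
```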

***

splitting into sections can also help us formulate a "table of contents".
the section's title is the first line, so we'll just skim it for each section.

assignment #8:   let's skim the header off each section, and list them...

>    ####################### output section-header lines
>    $thebook=substr($thebook,40);
>    @onesection=split("\n\n\n\n\n",$thebook); foreach $onesection (@onesection) {
>    $tit=substr($onesection,0,200); $tit=~s/^\s+//; $ccc++; if ($ccc <= "9") {print "o"};
>    @oneline=split("\n",$tit); $nnn=0; foreach $oneline (@oneline) {
>    $nnn++; if ($nnn eq "1" ) {print "$ccc -- $oneline\n"}}}
>    #   pseudo-code:   skim the header of each section and output it

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish08.pl

ok, that's nice enough -- indeed, it is _tantalizingly_ close --
and that means you can probably guess what we'll want next.
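
for illustration, the header-skimming might be sketched in python like this. `section_titles` and `numbered_titles` are invented names, the sample is stand-in data, and the text is assumed to use line-feeds. the sketch pads small numbers with a zero, which is a choice made here and not something the perl does.

```python
import re


def section_titles(book):
    """Return the first non-blank line of each section."""
    titles = []
    for section in re.split(r"\n{5,}", book):
        lines = [line.strip() for line in section.split("\n") if line.strip()]
        if lines:
            titles.append(lines[0])
    return titles


def numbered_titles(book):
    """Return the titles as 'nn -- title' strings, counting from 1."""
    return [f"{n:02d} -- {t}" for n, t in enumerate(section_titles(book), start=1)]


# invented stand-in data
sample = (
    "my antonia\nby willa cather\n\n\n\n\n"
    "book i\nthe shimerdas\n\n\n\n\n\n"
    "i\nfirst chapter text"
)

for entry in numbered_titles(sample):
    print(entry)
```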

***

assignment #9:   let's make that "table of contents" into hotlinks...

>    ####################### output hotlinked table-of-contents
>    print "<small><small>"; @thechap=split("\n\n\n\n\n",$thebook);
>    $pp=1; $past="myantc001"; foreach $thechap (@thechap) {
>    $nn++; if ($nn ne "0" and $nn ne "1")   {if ($pp<10) {print "o"};
>    print $pp; $pp++; print " -- "; $printme=substr($thechap,0,200);
>    if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
>    if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
>    if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
>    if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
>    if (substr($printme,0,1) eq "\n") {$printme=substr($printme,1,50)};
>    $ttt=0; @thetitle=split('\n',$printme); foreach $thetitle (@thetitle) {
>    $ttt++; if ($ttt eq 1) {
>    print '<a href="http://www.greatamericannovel.com/myant/';
>    print $past; print '.html">'; print substr($thetitle,1); print '</a>';
>    print "\n"; $past=substr($thechap,length($thechap)-55,500);
>    for ($i=0;$i<=99;$i++) {
>    if (substr($past,0,5) ne "myant") {$past=substr($past,1)};
>    if (substr($past,0,5) eq "myant") {$i=99};
>    } $past=substr($past,0,9);}}}}
>    #   pseudo-code:   split into sections, strip headers, and output

you can see the results of this code by running the new script:
>    http://www.greatamericannovel.com/scgi-bin/babelfish09.pl

wow.

now we're talking.

a completely _hotlinked_ table-of-contents for this book, executed
in just a dozen-and-a-half lines of hack-something-together code.

we can see this kind of thing being _useful_.

and we've only just started.

wow.
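
a python sketch of the same idea, offered as an illustration and not as a port of the script above. several things here are assumptions: the name `hotlinked_toc`, the exact shape of the pagebreak markers (taken to look like "{{myantp012.png}}"), and the sample data. the logic it illustrates: each header links to the page named by the last marker seen before that header.

```python
import html
import re

BASE_URL = "http://www.greatamericannovel.com/myant/"
MARKER = re.compile(r"\{\{(myant\w{4})")  # assumed shape: {{myantp012.png}}


def hotlinked_toc(book, first_target="myantc001"):
    """Return one html link per section header, skipping the front chunk."""
    entries = []
    target = first_target
    for section in re.split(r"\n{5,}", book)[1:]:
        lines = [line.strip() for line in section.split("\n") if line.strip()]
        if lines:
            title = html.escape(lines[0])
            entries.append(f'<a href="{BASE_URL}{target}.html">{title}</a>')
        found = MARKER.findall(section)
        if found:
            target = found[-1]  # the next header sits on this page
    return entries


# invented stand-in data
sample = (
    "front matter\n{{myantc001.png}}\n\n\n\n\n\n"
    "book i\ntext\n[[1]]\n{{myantp001.png}}\nmore\n[[2]]\n{{myantp002.png}}\n\n\n\n\n\n"
    "book ii\ntext"
)

for entry in hotlinked_toc(sample):
    print(entry)
```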

***

so let's start to wrap it up for today, ok?

before i go, i'd like to point you to one file in the "my antonia" set:
>    http://www.greatamericannovel.com/myant/myantp123.html

this file "validates", which i'm sure will make marcello very happy.
indeed, i even put in a link -- at the very _bottom_ of the page --
that submits the page to the validator to save him that little chore.
just click on "make my day", marcello, for your precious validation.

not only that, but i took out the "font" tag to make david happy too,
since he was concerned, as that tag is now "deprecated".   (oh no!)

of course, since i'm now using a header tag -- "h6", to be exact --
for the pagenumber, david might start whining about "tag abuse".

just goes to show how hard it is to make obsessives happy.      :+)

so let me put out this general call:   if anyone here wants to make
a template for our little open-source project here, please feel free!
make it as totally-standards-compliant as your little heart desires!

don't leave it up to me, folks.   as long as something _works_,
i'll be happy with it, no matter what the standards mafia says.

so if you want to save the world from non-standardized markup,
you'll need to step up with a solution.   or i'm deaf to your whining.

in terms of what i'd _like_, though, here's what i can tell you...

first, i'd like to be able to lay in a background -- like this one --
>    http://snowy.arsc.alaska.edu/bowerbird/misc/goodbook.jpg
that will _resize_itself_automatically_ to the window's exact size.
i don't know how to do this using .css, even though the need for
such a capability would appear to be totally obvious.   (it is to me.)

next, i'd like to have a 2-column layout where i can flow text
in each column separately.   (this i can pretty much do already,
although back when i was working on it, i seem to remember
there were some little niceties i wanted to introduce into it, and
now i can't remember what they were.)   also nice would be if you
can have the font-size increase until it fills its column, so it grows
as big as it can possibly be, while all of it remains in the window
(that is, so the end-user doesn't have to scroll down to see it).

one more thing;   if you know a way to force-justify, it'd be nice.

i can do all of these things in my offline apps, and i'd like to
have the server-side version look just as nice as those apps.

thanks for your contribution to our open-source gutvol-d project!

***

ok, here's an "executive summary" of the exercises we did today;
the first line tells us the assignment, the second gives us the u.r.l.

>    ####################### output file after reading it in
>    http://www.greatamericannovel.com/scgi-bin/babelfish01.pl

>    ####################### output 3 predetermined pages
>    http://www.greatamericannovel.com/scgi-bin/babelfish02.pl

>    ####################### output pages 111, 222, 333
>    http://www.greatamericannovel.com/scgi-bin/babelfish03.pl

>    ####################### output pages with "breeze"
>    http://www.greatamericannovel.com/scgi-bin/babelfish04.pl

>    ####################### output words #75319-75357
>    http://www.greatamericannovel.com/scgi-bin/babelfish05.pl

>    ####################### output all lines, numbered
>    http://www.greatamericannovel.com/scgi-bin/babelfish06.pl

>    ####################### output section number 21
>    http://www.greatamericannovel.com/scgi-bin/babelfish07.pl

>    ####################### output header of each section
>    http://www.greatamericannovel.com/scgi-bin/babelfish08.pl

>    ####################### output hotlinked table-of-contents
>    http://www.greatamericannovel.com/scgi-bin/babelfish09.pl

at this time, it's probably worth a step back to look at the big picture.

the ability to take an e-text file and spit out its pages of text, and to
do searches on it, and to provide a hotlinked table-of-contents page, 
is well-and-good for _one_ book.   but that is not all we have here...

no, what we have here is a tool that can enable us to do these things
for _20,000_ e-texts.   and we made it in a matter of _mere_minutes_
(ok, many hours for me to write it up, but you can copy it in seconds),
with fewer than 100 lines of code, copied out of a primer on perl...

think about that.

and we've only just begun.

so ok, now who wants to port this code into python, or php, or ruby?
c'mon, don't be shy, that's what open-source projects are all about...

anyway, that's a good day's work.   there'll be more in coming days...

-bowerbird
From desrod at gnu-designs.com  Mon Oct 30 14:43:32 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Mon Oct 30 14:44:31 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
In-Reply-To: <c14.79d1e63.3277cf5e@aol.com>
References: <c14.79d1e63.3277cf5e@aol.com>
Message-ID: <1162248212.5857.1.camel@localhost.localdomain>

On Mon, 2006-10-30 at 16:57 -0500, Bowerbird@aol.com wrote:
> no, what we have here is a tool that can enable us to do these things
> for _20,000_ e-texts.  and we made it in a matter of _mere_minutes_
> (ok, many hours for me to write it up, but you can copy it in
> seconds), with fewer than 100 lines of code, copied out of a primer on
> perl... 

It's obvious from reading the snippets, that it is indeed copied out of a
rudimentary Perl primer, and not touched by anyone who has a strong
grasp of the power of the language at hand. 

Exactly what is it you are trying to prove with this anyway? We know how
to write parsers that can chew up and spit out a Gutenberg etext into
other formats, I don't think that's the core of the problem here. 


-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From Bowerbird at aol.com  Mon Oct 30 17:07:43 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Oct 30 17:07:49 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
Message-ID: <beb.5d8c052.3277fbdf@aol.com>

david said:
>    Exactly what is it you are trying to prove with this anyway? 

sorry, the time for the endless listserve merry-go-round is done.

it was a fun run, wish you woulda been here for the whole thing.

but it's pudding time now.

if you want to put your questions on a publicly-editable
wiki somewhere, where we can refer future questioners,
so these topics don't get raised over and over as a mere 
_stalling_ tactic, then i'll be happy to answer them, there.

but right here, right now, it's straight-ahead only.

-bowerbird
From marcello at perathoner.de  Tue Oct 31 08:52:13 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Oct 31 08:52:18 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
In-Reply-To: <1162248212.5857.1.camel@localhost.localdomain>
References: <c14.79d1e63.3277cf5e@aol.com>
	<1162248212.5857.1.camel@localhost.localdomain>
Message-ID: <45477F3D.70205@perathoner.de>

David A. Desrosiers wrote:

> Its obvious from reading the snippets, that it is indeed copied out of a
> rudimentary Perl primer, and not touched by anyone who has a strong
> grasp of the power of the language at hand. 

He's a baby that makes poo in the chamberpot for the first time and
thinks his parents are watching him because they want poo.


> Exactly what is it you are trying to prove with this anyway? We know how
> to write parsers that can chew up and spit out a Gutenberg etext into
> other formats, I don't think that's the core of the problem here. 

He's just inventing warm water (and trying to get credit for it).

This parser is online. It converts any PG text into a plucker database.
And it is open source and written in gasp! python. We have served
130,000 plucker texts in October this way. The only guy who hasn't
noticed yet is him who notices everything.

There are a few other PG parsers around like GutenMark and my PG to TEI
converter. All of them are open source and working today. So it's only
natural that you-know-who will hold his non-working
at-the-rate-its-going-never-to-be-released zml parser against them, just
for the fun of causing confusion. Ever wondered who pays him to fuzz and
fudge?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From desrod at gnu-designs.com  Tue Oct 31 09:28:49 2006
From: desrod at gnu-designs.com (David A. Desrosiers)
Date: Tue Oct 31 09:29:35 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
In-Reply-To: <45477F3D.70205@perathoner.de>
References: <c14.79d1e63.3277cf5e@aol.com>
	<1162248212.5857.1.camel@localhost.localdomain>
	<45477F3D.70205@perathoner.de>
Message-ID: <1162315729.5921.0.camel@localhost.localdomain>

On Tue, 2006-10-31 at 17:52 +0100, Marcello Perathoner wrote:
> This parser is online. It converts any PG text into a plucker
> database. And it is open source and written in gasp! python. We have
> served 130,000 plucker texts in October this way. 

Plucker? I've heard of that application... <grin> 

-- 
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From marcello at perathoner.de  Tue Oct 31 11:34:46 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Oct 31 11:34:49 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
In-Reply-To: <c14.79d1e63.3277cf5e@aol.com>
References: <c14.79d1e63.3277cf5e@aol.com>
Message-ID: <4547A556.5000001@perathoner.de>

Bowerbird@aol.com wrote:

> hi.   this is my one post for 2006-october-30th.

This is just about programming. Why don't you post this to the
appropriate list?

  mailto: gutvol-p@lists.pglaf.org


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Tue Oct 31 12:48:58 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 31 12:49:11 2006
Subject: [gutvol-d] gvd061030 -- let's get it started in here
Message-ID: <c34.4e26dd5.327910ba@aol.com>


i told you there'd be flak...           :+)

-bowerbird
From Bowerbird at aol.com  Tue Oct 31 12:51:54 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Oct 31 12:52:04 2006
Subject: [gutvol-d] gvd061031 -- any more reaction to duguid?
Message-ID: <ce9.993430.3279116a@aol.com>

so is there any more reaction to the duguid article?

i wanna make sure everyone has had a chance to
say what they think before i tell you what i think...

-bowerbird