From tb at baechler.net  Fri Apr  1 23:22:47 2005
From: tb at baechler.net (Tony Baechler)
Date: Fri Apr  1 23:20:44 2005
Subject: [gutvol-d] Braille files
Message-ID: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>

Hi.  This is probably mostly for the webmaster, but I would like to get 
input from others on this list.  Also, if you think this is off topic, I 
can move this discussion to the gutvol-p list instead.

Has any progress been made with regard to making PG books available in 
Braille formatted files?  There is at least one free Braille translator for 
*nix that I know of.  I haven't seen this on the recode or bibrec pages.  I 
would be willing to test the Braille output.  I know of others who would 
probably also be interested.  Probably a link on the recode page would be 
best, but it would be nice to see a link as part of the standard download 
links on the bibrec pages.  That link could probably be a cgi program which 
would call the translator, NFBTrans, which would convert the file.  I 
believe that it can be set to write to stdout, but it might be more 
convenient to have it create a separate file with the .brf extension for 
downloading.
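
A minimal sketch of the "cgi program calls the translator" idea, assuming
the translator reads plain text on stdin and writes to stdout.  The real
nfbtrans command line is not shown because its options have not been
worked out; the translate_to_brf function and the STAND_IN command are
hypothetical names used only for illustration, and the stand-in merely
upper-cases its input rather than producing Braille.

```python
import subprocess
import sys


def translate_to_brf(text, command):
    """Pipe plain text through an external translator; return its stdout.

    `command` is the argument list for the translator.  The real nfbtrans
    options are deliberately left out: they are an open question.
    """
    result = subprocess.run(
        command,
        input=text.encode("ascii"),  # raises UnicodeEncodeError on 8-bit text
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,                  # raise if the translator exits non-zero
    )
    return result.stdout


# Hypothetical stand-in "translator" so the sketch runs without nfbtrans.
# It only upper-cases its input; it does NOT produce Braille.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    brf = translate_to_brf("Call me Ishmael.\n", STAND_IN)
    sys.stdout.write(brf.decode("ascii"))
```

A CGI wrapper would send the returned bytes to the browser or write them
to a file with the .brf extension; which of the two is more convenient is
left open here.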

Here are a couple of issues to keep in mind.  First, Braille only works 
with 7-bit, upper case files.  NFBTrans will convert from mixed case just 
fine, but it is important that they are not 8-bit files.  Braille does not 
support international accented characters.  There are different Braille 
codes for other languages such as Spanish or German, but I don't know how 
well NFBTrans supports them.  Besides it would be difficult to have an 
automated way of passing this information on the command line at the time 
of translation.  Second, it will work only on plain text.  It does not 
convert from html at all.  Third, it might be best to have two links, one 
for a single file and one for multiple volumes.  The reason why is that one 
printed page is about 3-6 Braille pages, depending on the size of the print 
and any illustrations.  Many blind people now use special PDAs which can 
handle big files, but large files are bad for embossers.  If someone 
downloads and tries to emboss a 300 page print book, it could easily be 
900-1,000 pages in Braille.  So, it is best to also make smaller files 
available.  These are known as volumes.  That same book could be downloaded 
in several Braille volumes of 100-250 Braille pages each.  That is much 
better for binding etc.  I think the best way to do this is with the 
standard split utility.  You can set it for X lines and that would be 
perfect.  It would not indicate at the top of each new file that it is a 
subsequent volume in a set, but that can be added by the person doing the 
embossing if desired.  I don't know exactly what the correct number of 
lines per file is, but I can check into that.
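
The splitting step can be sketched as follows.  This is only an
illustration of cutting a translated file every X lines, much as the
split utility would.  The figures of 25 lines per Braille page and 200
pages per volume are assumptions chosen from within the 100-250 page
range given above; the correct line count still has to be checked, and
the function name is hypothetical.

```python
def split_into_volumes(lines, pages_per_volume=200, lines_per_page=25):
    """Cut a list of Braille lines into volumes of whole pages.

    Both defaults are assumptions for illustration, not confirmed values.
    """
    chunk = pages_per_volume * lines_per_page
    return [lines[i:i + chunk] for i in range(0, len(lines), chunk)]


if __name__ == "__main__":
    # A 900-page Braille book at 25 lines per page.
    book = ["(braille line)"] * (900 * 25)
    volumes = split_into_volumes(book)
    for number, volume in enumerate(volumes, start=1):
        print("volume", number, "has", len(volume) // 25, "pages")
```

As noted above, nothing here marks each piece as a subsequent volume in a
set; that would have to be added separately.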

Finally, after the translation to Braille options are set up and working, 
it would be nice if PG could have an official press release announcing this 
service.  I know of several groups for the blind and many blind individuals 
who would be very interested.  This is not as good of a solution as making 
books available in the new DAISY format, but I am not aware of a free html 
to DAISY converter, and I don't think that DAISY works on plain text at 
all.  Once a master xml format is in place, conversion to DAISY would be 
even simpler, but I don't think there are any free tools.

Really all that needs to be done is to integrate a script into the PG site 
that calls nfbtrans and generates the .brf file.  The .brf can be sent to 
the browser for downloading.  Windows users would probably have to 
right-click on the Braille links and save the files manually.  It would not 
be much harder for the multiple volumes, just use a split utility on the 
.brf file and create a temporary page with each piece of the file linked so 
they can also be downloaded and saved.

Does anyone have any thoughts on this?  I don't mind testing or finding 
others to test, but I admit that my knowledge of cgi scripting is almost 
zero.  I can compile nfbtrans easily enough, but I don't know how to 
integrate it into the recode facility or the PG site in general.

From tb at baechler.net  Fri Apr  1 23:23:05 2005
From: tb at baechler.net (Tony Baechler)
Date: Fri Apr  1 23:21:01 2005
Subject: [gutvol-d] opal-online.org
Message-ID: <5.2.0.9.0.20050401232258.01f91640@baechler.net>

Hello all.  I recently found this site and am wondering if PG could either 
work with them or make use of their services.

http://www.opal-online.org/

One thing they offer is a voice over ip chat facility, so perhaps someone 
can make a presentation about what PG is and what they have to offer.  They 
also offer podcasts of some programs so people can get an idea of what they 
offer.  They might be for the blind, but I don't think they are 
specifically designed just to appeal to the blind.  If someone wants to 
contact them and schedule a presentation, it might get PG some more 
publicity.  If this link is already known or people determine that it is of 
no use, then my apologies for mentioning it.

From marcello at perathoner.de  Sat Apr  2 12:32:36 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sat Apr  2 12:41:27 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>
References: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>
Message-ID: <424F0164.2000309@perathoner.de>

Tony Baechler wrote:

> Really all that needs to be done is to integrate a script into the PG 
> site that calls nfbtrans and generates the .brf file.  The .brf can be 
> sent to the browser for downloading.  Windows users would probably have 
> to right-click on the Braille links and save the files manually.  It 
> would not be much harder for the multiple volumes, just use a split 
> utility on the .brf file and create a temporary page with each piece of 
> the file linked so they can also be downloaded and saved.

Piping the files thru nfbtrans should pose no problem at all.

Question: won't every blind computer user have this program on his PC 
already? Won't she be better able to tailor the output to her needs if 
the program is run locally?


-- 
Marcello Perathoner
webmaster@gutenberg.org


From gbnewby at pglaf.org  Sun Apr  3 14:11:22 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Apr  3 14:11:23 2005
Subject: [gutvol-d] opal-online.org
In-Reply-To: <5.2.0.9.0.20050401192829.0264c210@baechler.net>
References: <5.2.0.9.0.20050401192829.0264c210@baechler.net>
Message-ID: <20050403211122.GA31719@pglaf.org>

On Fri, Apr 01, 2005 at 07:31:41PM -0800, Tony Baechler wrote:
> Hello all.  I recently found this site and am wondering if PG could either 
> work with them or make use of their services.
> 
> http://www.opal-online.org/
> 
> One thing they offer is a voice over ip chat facility, so perhaps someone 
> can make a presentation about what PG is and what they have to offer.  They 
> also offer podcasts of some programs so people can get an idea of what they 
> offer.  They might be for the blind, but I don't think they are 
> specifically designed just to appeal to the blind.  If someone wants to 
> contact them and schedule a presentation, it might get PG some more 
> publicity.  If this link is already known or people determine that it is of 
> no use, then my apologies for mentioning it.

Hi, Tony.  This sounds worth pursuing.  However, I don't think
we have a "someone" working here :-)

Please, feel free - and empowered - to introduce PG to them,
and pursue any viable cooperative efforts.  We do have a few
ongoing discussions with librarians (i.e., for cataloging &
for distributing MARC-format records), but they're not far along
enough for the opal-online folks to just jump aboard.

This looks like a worthwhile relationship to look into.
  -- Greg

From gbnewby at pglaf.org  Sun Apr  3 14:25:15 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Apr  3 14:25:17 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <424F0164.2000309@perathoner.de>
References: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>
	<424F0164.2000309@perathoner.de>
Message-ID: <20050403212515.GD31719@pglaf.org>

On Sat, Apr 02, 2005 at 10:32:36PM +0200, Marcello Perathoner wrote:
> Tony Baechler wrote:
> 
> >Really all that needs to be done is to integrate a script into the PG 
> >site that calls nfbtrans and generates the .brf file.  The .brf can be 
> >sent to the browser for downloading.  Windows users would probably have 
> >to right-click on the Braille links and save the files manually.  It 
> >would not be much harder for the multiple volumes, just use a split 
> >utility on the .brf file and create a temporary page with each piece of 
> >the file linked so they can also be downloaded and saved.
> 
> Piping the files thru nfbtrans should pose no problem at all.
> 
> Question: won't every blind computer user have this program on his PC 
> already? Won't she be better able to tailor the output to her needs if 
> the program is run locally?

I think I corresponded with Tony about this a couple of years
ago.  I had no less than three programmers working on "conversion
on the fly" which would generate formats including Braille,
MP3, and others from .txt, .htm or .xml source files.

Unfortunately, none of these ever became complete enough to
offer on the Project Gutenberg download site.  

I am still very much able to provide a programming platform (a Linux
server, with plenty of space and a copy of the PG collection)
to people who might want to develop a CGI, PHP, Web services,
or other platform for this type of functionality.  Most of the
tools already exist (i.e., for .txt to HTML, or Braille), but
it's still a complex problem due to the complex nature of our
collection.  (That is, lots of different files & formats to
choose from.)

While Marcello is correct that many blind or visually impaired computer
users already have nfbtrans (or something similar), I still think a
general purpose conversion on-the-fly between formats is useful.  And,
if we offer this functionality, then an option to convert to Braille via
nfbtrans is a very easy addition.  There are just a few options for the
output....like all of the other transformation programs...

Long-timers on this list are getting tired of me talking about
conversion on the fly, I know.  Plus, this inevitably leads to a
discussion of eBooks being "born as" XML, which I have also tried to
facilitate.  People who are newer (or newly-energized!)  are welcome to
look at viable methods for delivering some of this functionality.
Beware, though: it's a larger and more complex problem than you might
guess at first.

Some folks will even remember that I offered a "bounty" reward
for completely functional applications.  This offer still stands.

  -- Greg

From marcello at perathoner.de  Sun Apr  3 14:46:39 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Apr  3 14:46:22 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <20050403212515.GD31719@pglaf.org>
References: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>	<424F0164.2000309@perathoner.de>
	<20050403212515.GD31719@pglaf.org>
Message-ID: <4250643F.3080409@perathoner.de>

Greg Newby wrote:

> I think I corresponded with Tony about this a couple of years
> ago.  I had no less than three programmers working on "conversion
> on the fly" which would generate formats including Braille,
> MP3, and others from .txt, .htm or .xml source files.
> 
> Unfortunately, none of these ever became complete enough to
> offer on the Project Gutenberg download site.  

This is a matter of a few hours ... it's almost the same as the file 
recode service.

I just need somebody to work out the nfbtrans command line options that 
work best for our etexts. Anybody got a braille embosser to test?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Sun Apr  3 17:36:22 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Apr  3 17:36:40 2005
Subject: [gutvol-d] re: viable methods and completely functional applications
Message-ID: <1e0.3956f0e4.2f81e606@aol.com>

greg said:
>   People who are newer (or newly-energized!)  are welcome to
>   look at viable methods for delivering some of this functionality.
>   Beware, though: it's a larger and more complex problem 
>   than you might guess at first.

perhaps it would help if you laid out the complexity of the problem.


>   Some folks will even remember that I offered a "bounty" reward
>   for completely functional applications.  This offer still stands.

perhaps if you were clear about the capabilities you want to deliver,
rather than the means of obtaining them, you might have better luck.

-bowerbird
From gbnewby at pglaf.org  Sun Apr  3 20:20:54 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Apr  3 20:20:56 2005
Subject: [gutvol-d] More PG spam being spread around
In-Reply-To: <000a01c5352c$9790a780$0c9495ce@gw98>
References: <000a01c5352c$9790a780$0c9495ce@gw98>
Message-ID: <20050404032054.GA6488@pglaf.org>

On Wed, Mar 30, 2005 at 08:29:50AM -0500, N Wolcott wrote:
> Resellers of PG books have taken on a new target, Lulu.com.

Thanks for sending this info, Norm.  I talked with Lulu
a couple of years ago, and have been in recent communication
with them.  I think the titles you found are just early
experiments (though not hidden from public view, as they
should be).

I've told Lulu that our items must be clearly indicated
as public domain.  If we can agree on things (which seems
likely - they've been pretty open to discussion, so far),
I see POD as a viable option for people who want our stuff.
Maybe a "buy it now" link in our catalog hits...maybe some
other way of allowing people who want print to get it.

Compared to other publishers and book resellers I've
interacted with over the years, Lulu is unusually receptive
to our demands and tendencies.  This is probably because
they have such diversity in their author base.

Specific advice or demands or qualifications on how Lulu
might sell our stuff would be welcome.  It looks like they're
using some templates for our content, so we could add a lot
of general boilerplate.  Plus, as I said, a clear statement
that the whole book (cover to cover, including the covers)
is public domain, and suitable for unlimited redistribution.
  -- Greg

> Lulu offers POD publishing at zero up front cost, thus luring those who find free advertising for their spam. The postings I have seen so far both imply PG and Lulu are supporting their spam. They advertise the quality of their texts as being from PG. One of them admits there may be errors. 
> 
> There is probably nothing for PG to do except to get Lulu to take the PG name off their customers' postings. If they want to host 15000 books on their computers for free that is their business. I quote my post to the Lulu forum below. I have posted 2 books to Lulu at a 15 cent royalty with added content to the PG text and I do not mention PG in the blurb. My "quality" book may soon be submerged in a flood of Lulu spam. 
> 
> Posting follows:
> -------------------------
> Lulu offers a good service for self publishers who provide "content added" material. This allows the publisher to continually upgrade the product until it is in final form, then market it through Lulu's various mechanisms. 
> 
> However, recently public domain texts lifted from Project Gutenberg have been appearing on Lulu. The accompanying blurb states that www.lulu.com and Project Gutenberg have joined forces to offer you these long out of print books. The implication is that somehow Lulu and PG are supporting this effort. PG is trademarked and there is no right to use the name in advertising; enforcing the trademark is another thing, however, for an all-volunteer organization. 
> 
> Software exists to move PG texts to a number of formats, iPod, ebook, etc., including Lulu. So there is a real possibility that most of the 15000 PG books could end up being hosted on Lulu. No review copy would ever be required, so the posting for the converter would be free. Lulu could end up hosting the entire PG corpus for free in a kind of publishing spam. The books are listed with a royalty of $1 to $2.  One is published with a $1.59 royalty, and claims that $1 will be contributed to PG for every book sold. This leaves only 27 cents for the seller. 
> 
> In one case the publisher had re-copyrighted the book and in the other had listed it as Public Domain. Nothing wrong with this, but the copyright only applies to "new material" and certainly not the entire book. In one case an ISBN was listed, so Lulu might have gotten some revenue from that if the ISBN is real. One of the books was listed as 5000 in sales, so I imagine that is how many Lulu has in its archive. It may soon get 14,999 more!
> 
> Another feature with Lulu is you never know who is selling the book. Lulu distributes it, but the real seller is someone else, unknown. This may raise legal issues about ultimate responsibility. 
> 
> People like myself who provide added content at no or minimal royalty will be unhappy to see our listing efforts buried in an avalanche of Lulu spam. At the very least Lulu should require permission before violating trademark laws. 
> 
> To see the books in this post, search for "Verne" on Lulu. 
> 
> The additional cost of hosting all these books could end up in forcing up front charges on Lulu providers or radically restructuring the way Lulu operates,  neither of which is desirable in my humble opinion. 
> 
> I mention this as a discussion topic, as I feel it is an emerging problem.
> 
> ---------------------
> N Wolcott  nwolcott2@post.harvard.edu

> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From tb at baechler.net  Sun Apr  3 23:24:30 2005
From: tb at baechler.net (Tony Baechler)
Date: Sun Apr  3 23:24:50 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <424F0164.2000309@perathoner.de>
References: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>
	<5.2.0.9.0.20050401230106.01f917d0@baechler.net>
Message-ID: <5.2.0.9.0.20050403231750.047290a0@baechler.net>

Hello.  Surprisingly, many people do not have Braille 
translators.  NFBTrans is the only free, open source translator I know 
of.  Most cost many hundreds or thousands of dollars.  In the case of PDAs, 
they can do translation but not reliably.  There are many different types 
of Braille codes.  There is Braille music, computer Braille, literary 
Braille and a special mathematics code.  What we want is literary Braille 
since we are dealing with books.  However, most PDAs and embossers would 
only output computer Braille which is harder to read and takes up more 
pages.  If you look at a Braille embosser formatted file and compare it to 
the plain text, you will see that the Braille file is usually slightly 
smaller.  That is because of contractions and other things to shorten 
words.  For example, "sh" by itself in Braille is short for shall.  "W" by 
itself is will.

Unless such blind people are familiar with a command line or run Linux, 
most likely they wouldn't have access to a Braille translator and would 
probably appreciate Braille files being available directly from the PG 
download pages.  I can query a few blind-specific mailing lists if you need 
more exact stats on how many people would be interested.  It would 
certainly raise the reputation of PG if a press release was issued and the 
capability was implemented.

At 10:32 PM 4/2/2005 +0200, you wrote:

>Piping the files thru nfbtrans should pose no problem at all.
>
>Question: won't every blind computer user have this program on his PC 
>already? Won't she be better able to tailor the output to her needs if the 
>program is run locally?
>
>
>
>--
>Marcello Perathoner
>webmaster@gutenberg.org
>
>

From tb at baechler.net  Mon Apr  4 00:34:26 2005
From: tb at baechler.net (Tony Baechler)
Date: Mon Apr  4 00:34:51 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <20050403212515.GD31719@pglaf.org>
References: <424F0164.2000309@perathoner.de>
	<5.2.0.9.0.20050401230106.01f917d0@baechler.net>
	<424F0164.2000309@perathoner.de>
Message-ID: <5.2.0.9.0.20050404002144.01f89c80@baechler.net>

Hi Greg.  If you corresponded with me, I have no recollection of it.  I 
would like to see this happen, but I don't remember any discussion with 
you.  I am the first to admit that I am not a programmer.  I appreciate 
your offer, and it would be useful for compiling and testing nfbtrans, but 
I have no idea how to set up a script that would produce Braille files on 
the fly.  Also, you said that it would be very hard to create Braille 
output because of the many formats involved.  I couldn't disagree with you 
more on this.  All Braille needs is plain text.  Currently, nfbtrans won't 
work with other formats except text and a few language source code 
files.  Simply pipe the 7-bit plain text through nfbtrans and don't worry 
about the format.  My understanding is that even a master xml format will 
still have a 7-bit equivalent.  For non-text, just don't offer a Braille 
option.  There is probably a way to translate mathematics, but I am not 
aware of it with nfbtrans and it would add complexity because the .pdf or 
.tex would first need to be converted to 7-bit ASCII.
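
The "7-bit or don't offer a Braille option" rule can be sketched as a
simple check on the bytes of a file.  The function name is hypothetical
and the check is only an illustration of the rule stated above; it says
nothing about how the PG site actually stores or labels its files.

```python
def is_seven_bit(data):
    """Return True if every byte is 7-bit ASCII (below 0x80)."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    plain = b"It is a truth universally acknowledged\n"
    accented = "caf\u00e9\n".encode("latin-1")  # contains the 8-bit byte 0xE9
    for sample in (plain, accented):
        if is_seven_bit(sample):
            print("offer Braille link")
        else:
            print("no Braille link for this file")
```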

I think that if nothing else, PG should offer Braille output just because 
the motto is that the files should be able to be viewed by anyone with any 
equipment.  In many ways, literary Braille is similar to very old formats 
in that it is only plain text and is all upper case.  Yet that is still 
what many blind people use, including the US Library of Congress.

I have a question.  How is it that you come to the conclusion that most 
blind people already have Braille translation software?  I have read stats 
that no more than 12% of the blind can read at all.  Of those, I would 
guess that not many have the computer knowledge to use a command line 
program such as nfbtrans.  It is still a DOS-based program unless you want 
to compile the sources under Linux.  I am not sure if development is still 
being done.  I am not currently aware of an official download site but I 
can check into this if this is something that we're willing to move ahead 
on.  I would be happy to look at the Linux server Greg is offering, but 
again I am no programmer so unless it can be done in a one or two line 
script, I'm rather helpless.  Also, nfbtrans has many, many command line 
switches.  It is designed to format Braille books, so offers facilities for 
running headers, table of contents, etc.

From gbnewby at pglaf.org  Mon Apr  4 01:04:27 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Mon Apr  4 01:04:28 2005
Subject: [gutvol-d] Enlightened self-interest
In-Reply-To: <421F547A.1080007@zytrax.com>
References: <421F547A.1080007@zytrax.com>
Message-ID: <20050404080427.GA11773@pglaf.org>

On Fri, Feb 25, 2005 at 11:38:18AM -0500, Ron Aitchison wrote:
> Having discovered Jane Austen regrettably late in life I have 
> down-loaded a couple of novels and since I find the raw text format 
> unpleasant to read I have reformatted for my own use.
> It seems to me since I have the ability to produce PDFs and OpenOffice 
> formats and even - heaven forfend - MS doc format should they be wanted, 
> it would be churlish not to make such an offer.
> If you can point me at a standard for PDF, page width, font size etc, 
> etc., and let me know what formats you do want I would be happy to 
> undertake the small additional work for the two novels I have currently 
> downloaded.
> I cannot supply DocBook at this time but hope to have that available 
> shortly.

Hi, Ron.  I don't think your offer ever got a response.
Sorry about that!

We would love to have HTML for our Austen titles that are
missing it.  Generally, we're a little cool on .doc,
.pdf, .sxw, etc. due to fears for their longevity, and
difficulty in applying fixes.  In the future we hope to
have much more XML, but are still working on getting the
production stream moving for it.

If you produce some new formats, please first consult
the FAQ on production guidelines: http://gutenberg.org/faq
Then, email pgww@pglaf.org and we'll arrange some good
ways for you to get them to us (depending on how many
you want to prepare, etc.).

Thanks again for this offer!
  -- Greg


> Regards
> 
> -- 
> Ron Aitchison
From tb at baechler.net  Mon Apr  4 01:15:55 2005
From: tb at baechler.net (Tony Baechler)
Date: Mon Apr  4 01:16:17 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <4250643F.3080409@perathoner.de>
References: <20050403212515.GD31719@pglaf.org>
	<5.2.0.9.0.20050401230106.01f917d0@baechler.net>
	<424F0164.2000309@perathoner.de> <20050403212515.GD31719@pglaf.org>
Message-ID: <5.2.0.9.0.20050404011003.02c6c830@baechler.net>

Hi.  I can help you with this.  Please give me a detailed list of what you 
want/need tested and I will ask people to look into it.  I know of at least 
two mailing lists that would have blind subscribers interested in looking 
at this.  Remember that we need to keep in mind that many people will use 
PDAs designed for the blind.  There is no real difference except that they 
don't need formfeeds while embossers do and generally one big file works 
better than multiple volumes.  For embossers, formfeeds are a help, but 
nfbtrans puts that in automatically once you set the page length.  You 
generally want the page length at 25 lines and 40 columns.  If you can put 
files somewhere for people to download and test or somehow find a way for 
people to experiment with various kinds of output, that would be best.  I 
think there might be someone on one of the lists that uses nfbtrans and 
could help more with the switches.  Unfortunately, even among the blind 
it's mostly a Windows world so few people know how to use the command line.
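
To make the embosser/PDA difference concrete, here is a small sketch that
groups already-translated lines into pages of 25 lines and 40 columns and
separates the pages with formfeeds only when asked.  It is an
illustration of what the page settings mean, not a description of how
nfbtrans works internally; the function name is hypothetical, and placing
a formfeed between pages but not after the last one is an assumption.

```python
FORM_FEED = "\f"


def paginate(lines, lines_per_page=25, columns=40, form_feeds=True):
    """Join lines into pages, separated by formfeeds for embossers.

    With form_feeds=False the result is one continuous file, which is
    what a PDA is said to prefer.
    """
    for line in lines:
        if len(line) > columns:
            raise ValueError("line longer than %d columns" % columns)
    pages = [
        "".join(line + "\n" for line in lines[i:i + lines_per_page])
        for i in range(0, len(lines), lines_per_page)
    ]
    separator = FORM_FEED if form_feeds else ""
    return separator.join(pages)


if __name__ == "__main__":
    sample = ["X" * 40] * 60                     # 60 lines: 25 + 25 + 10
    embosser_file = paginate(sample)             # formfeeds between pages
    pda_file = paginate(sample, form_feeds=False)
    print(embosser_file.count(FORM_FEED), pda_file.count(FORM_FEED))
```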

At 11:46 PM 4/3/2005 +0200, you wrote:
>This is a matter of a few hours ... it's almost the same as the file recode 
>service.

>I just need somebody to work out the nfbtrans command line options that 
>work best for our etexts. Anybody got a braille embosser to test?

From shimmin at uiuc.edu  Mon Apr  4 06:40:38 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Mon Apr  4 06:40:41 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <20050403212515.GD31719@pglaf.org>
References: <5.2.0.9.0.20050401230106.01f917d0@baechler.net>	<424F0164.2000309@perathoner.de>
	<20050403212515.GD31719@pglaf.org>
Message-ID: <425143D6.4040909@uiuc.edu>

I can offer little commentary about Braille, but as far as audio formats 
are concerned, I can say that the visually impaired persons I have known 
who listen to substantial amounts of text often listen to it at 
substantially faster-than-intended speeds, and some do some other 
transformations on the audio to help them better catch meaning at these 
high speeds.  Such things seem to be idiosyncratic to the individual 
listener.

A fairly 'raw' format, such as .wav, would I think be useful to such 
users, but lossy compression formats have been engineered around 
assumptions about listening conditions that simply aren't true for the 
practices of those who listen to recorded speech as a way of life, and 
may be of little practical value.

-- RS

From kouhia at nic.funet.fi  Mon Apr  4 07:31:06 2005
From: kouhia at nic.funet.fi (Juhana Sadeharju)
Date: Mon Apr  4 07:31:15 2005
Subject: [gutvol-d] Re: Scanner vs. digital camera
Message-ID: <S16144AbVDDObG/20050404143106Z+5949@nic.funet.fi>

>From: Carlo Traverso <traverso@dm.unipi.it>
>
>    Juhana> ftp://ftp.funet.fi/pub/sci/audio/devel/books/
>
>Please, instead of putting there a big tar.gz file of 72MB, can you
>put some individual images?

Now there are, sorry.
The images at the end of the list are good for OCR testing.
The images at the beginning are test images: with/without flash,
with normal/with "flower" focus.

>Indeed, my attempts with a good digital camera (5Mpixels, manual
>focus, uncompressed tiff output, a special mode for text, a
>professional tripod, etc) have been poor.

Canon EOS 300D, or perhaps newer EOS 350D would do the same.
Nikon D70.
Minolta Dynax 7D (European model name, US has a different name).

At least Canon and Nikon raw images have been converted with
open source software. See the gphoto web pages for links and details.
I'm not sure about proper Linux interfacing. I should have it,
because I would like to write special software for digitization.
(Canon has a proper SDK only for Windows.)

Juhana
-- 
  http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
  for developers of open source graphics software
From marcello at perathoner.de  Mon Apr  4 09:13:01 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Apr  4 12:39:25 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <5.2.0.9.0.20050404011003.02c6c830@baechler.net>
References: <20050403212515.GD31719@pglaf.org>	<5.2.0.9.0.20050401230106.01f917d0@baechler.net>	<424F0164.2000309@perathoner.de>
	<20050403212515.GD31719@pglaf.org>
	<5.2.0.9.0.20050404011003.02c6c830@baechler.net>
Message-ID: <4251678D.5050705@perathoner.de>

Tony Baechler wrote:

> I think there might be 
> someone on one of the lists that uses nfbtrans and could help more with 
> the switches.

Just ask some people to download a selection of PG files, run them thru 
nfbtrans and print a few test pages on the embosser or read them on the 
PDA. You should get at least a dozen or so different files from 
different producers done in different years. Formatting of PG texts has 
changed over the years.

What we want is the set of commandline options that give the best 
results for most of the blind people out there.


-- 
Marcello Perathoner
webmaster@gutenberg.org


From fielden3 at aol.com  Mon Apr  4 13:22:03 2005
From: fielden3 at aol.com (Kent Fielden)
Date: Mon Apr  4 13:22:35 2005
Subject: [gutvol-d] Re: Scanner vs. digital camera
In-Reply-To: <200503311452.j2VEqdn29068@posso.dm.unipi.it>
References: <S16403AbVCaO0W/20050331142623Z+3259@nic.funet.fi>
	<200503311452.j2VEqdn29068@posso.dm.unipi.it>
Message-ID: <4251A1EB.8050704@aol.com>

Carlo Traverso wrote on 3/31/2005, 6:52 AM:
 > Indeed, my attempts with a good digital camera (5Mpixels, manual
 > focus, uncompressed tiff output, a special mode for text, a
 > professional tripod, etc) have been poor.

    I am surprised to hear this.  I use a Canon S230 3.2Mpixel pocket 
camera with results as good as my scanner for OCR for ABBYY FineReader 
5.0.  This is a relatively simple pocket camera.
    The one thing that took some real work is doing a good job of 
lighting the book.  I now use 2 lights mounted on each side of the 
camera (currently 13 watt fluorescent task lights, but normal 
incandescent lights worked as well).  I had no luck at all using the 
flash.  I use automatic focus, no flash, close-up mode, with a long 
exposure time.  I use a copy stand modified from a hand drill press to 
position the camera about 9" above the book.  I take each page 
separately, a 2k x 1.5k JPEG for each 7" by 4.5" page, or almost 300 
DPI.  The OCR results for 600 DPI, taking a picture of 1/2 the page were 
no better than the full page results.
    Clearly, especially for our purposes, the quality of the original 
makes some difference.
     How do the pictures you took look to you?  It has been my 
experience that if they looked like faithful reproductions, then they 
OCRed well.  It may be that your expectations of results are higher than 
mine.  If you are interested, I could send you a picture to see what I get.
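
The "almost 300 DPI" figure above can be checked with a little arithmetic. This is only a sketch: it assumes "2k x 1.5k" means 2048 x 1536 pixels and that the page fills the frame exactly, neither of which is stated outright; the function name is made up.

```python
def effective_dpi(pixels_long, pixels_short, page_long_in, page_short_in):
    """Pixels per inch actually delivered for a photographed page.

    The smaller of the two axis values is the limiting one, since the
    page has to fit inside the frame in both directions.
    """
    return min(pixels_long / page_long_in, pixels_short / page_short_in)


# Assumed: a 2048 x 1536 image covering a 7" x 4.5" page.
dpi = effective_dpi(2048, 1536, 7.0, 4.5)
print(round(dpi))  # 293
```

The long side is the limiting one here (2048 / 7 is about 293), which is where "almost 300" comes from.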

Kent Fielden

From tb at baechler.net  Tue Apr  5 00:31:28 2005
From: tb at baechler.net (Tony Baechler)
Date: Tue Apr  5 00:31:47 2005
Subject: [gutvol-d] Braille files
In-Reply-To: <425143D6.4040909@uiuc.edu>
References: <20050403212515.GD31719@pglaf.org>
	<5.2.0.9.0.20050401230106.01f917d0@baechler.net>
	<424F0164.2000309@perathoner.de> <20050403212515.GD31719@pglaf.org>
Message-ID: <5.2.0.9.0.20050405002544.04176330@baechler.net>

Hi Robert.  I agree with you and I am blind, so I'm glad you finally said 
it.  I listen to text to speech at about 430 words per minute.  That is 
slower than some people.  I think it is great that PG has made some audio 
books available.  However, to be blunt about it, I find them awful to 
listen to.  They have low volume so are hard to hear and are very, very 
slow.  I can't stand speech that slow!  I know that the sighted public 
don't do well with computer speech, so I suppose it's necessary for them, 
but I would much rather have the speed set to at least 300 words per minute 
at the slowest.

I wouldn't necessarily agree with you about raw wave files.  I think mp3 is 
fine.  One very big Internet radio station for the blind is ACB 
Radio.  They are online at:

http://www.acbradio.org/

They use mp3 exclusively, and at a rather low bitrate.  I have no problem 
listening to it.  In general, I don't like mp3.  I collect old time radio 
and refuse to accept anything but raw wave files, but that is because it is 
of historical value and should be saved in the best audio condition possible.

At 08:40 AM 4/4/2005 -0500, you wrote:
>I can offer little commentary about Braille, but as far as audio formats 
>are concerned, I can say that the visually impaired persons I have known 
>who listen to substantial amounts of text often listen to it at 
>substantially faster-than-intended speeds, and some do some other 
>transformations on the audio to help them better catch meaning at these 
>high speeds.  Such things seem to be idiosyncratic to the individual listener.
>
>A fairly 'raw' format, such as .wav, would I think be useful to such 
>users, but lossy compression formats have been engineered around 
>assumptions about listening conditions that simply aren't true for the 
>practices of those who listen to recorded speech as a way of life, and may 
>be of little practical value.

From schultzk at uni-trier.de  Tue Apr  5 00:37:44 2005
From: schultzk at uni-trier.de (Keith J.Schultz)
Date: Tue Apr  5 00:39:45 2005
Subject: [gutvol-d] Re: Scanner vs. digital camera
In-Reply-To: <4251A1EB.8050704@aol.com>
References: <S16403AbVCaO0W/20050331142623Z+3259@nic.funet.fi>
	<200503311452.j2VEqdn29068@posso.dm.unipi.it>
	<4251A1EB.8050704@aol.com>
Message-ID: <c8e8ff0a7674e29b8e6b52e044eecc8c@uni-trier.de>

Hi,

	I have to see what my Canon 20D will do in BW-mode.
	But as far as resolution and OCR go, you do not want to go any higher
	than 300 dpi, because above 300 dpi the OCR starts to see the
	structure of the paper and makes mistakes. This is even more
	important with older books. Try just using 144 dpi; this should
	give you the same results as 300 dpi.

		Keith.

Am 04.04.2005 um 22:22 schrieb Kent Fielden:

> Carlo Traverso wrote on 3/31/2005, 6:52 AM:
>> Indeed, my attempts with a good digital camera (5Mpixels, manual
>> focus, uncompressed tiff output, a special mode for text, a
>> professional tripod, etc) have been poor.
>
>     I am surprised to hear this.  I use a Canon S230 3.2Mpixel pocket
> camera with results as good as my scanner for OCR with ABBYY FineReader
> 5.0.  This is a relatively simple pocket camera.
>     The one thing that took some real work is doing a good job of
> lighting the book.  I now use 2 lights mounted on each side of the
> camera (currently 13 watt fluorescent task lights, but normal
> incandescent lights worked as well).  I had no luck at all using the
> flash.  I use automatic focus, no flash, close-up mode, with a long
> exposure time.  I use a copy stand modified from a hand drill press to
> position the camera about 9" above the book.  I take each page
> separately, a 2k x 1.5k JPEG for each 7" by 4.5" page, or almost 300
> DPI.  The OCR results for 600 DPI, taking a picture of 1/2 the page
> were no better than the full page results.
>     Clearly, especially for our purposes, the quality of the original
> makes some difference.
>      How do the pictures you took look to you?  It has been my
> experience that if they looked like faithful reproductions, then they
> OCRed well.  It may be that your expectations of results are higher
> than mine.  If you are interested, I could send you a picture to see
> what I get.
>
> Kent Fielden
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

From joshua at hutchinson.net  Tue Apr  5 05:20:05 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Apr  5 05:19:57 2005
Subject: [gutvol-d] Re: Scanner vs. digital camera
Message-ID: <20050405122005.4FBE910989A@ws6-4.us4.outblaze.com>

144dpi is only going to work well with larger print books with very clean pages (read: modern fiction will usually work well at this DPI).  You get into some of the older, smaller print stuff and 144 dpi is going to fail miserably and at times spectacularly.

300 dpi is the happy medium in most cases.  Above that, stray marks on the paper can be misinterpreted at times by the OCR process.  However, on some fairly small print books, 600dpi has been necessary.
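
To put rough numbers on why small print suffers at low resolution, here is a sketch that converts type size and scan resolution into the pixel height available for a line of type. The point sizes (10 pt and 6 pt) are illustrative assumptions, not measurements from any particular book, and the function name is made up.

```python
def type_height_px(point_size, dpi):
    """Height in pixels of type of the given point size at the given dpi.

    One point is 1/72 of an inch, so height = points * dpi / 72.
    """
    return point_size * dpi / 72.0


# Assumed sizes: 10 pt for ordinary body text, 6 pt for small print.
for point_size in (10, 6):
    for dpi in (144, 300, 600):
        height = type_height_px(point_size, dpi)
        print(f"{point_size} pt at {dpi} dpi: {height:.1f} px")
```

Under those assumptions, 6 pt type at 144 dpi gets only 12 pixels of height, while the same type at 600 dpi gets 50, which is more than 10 pt type gets at 300 dpi.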

Josh

----- Original Message -----
From: "Keith J . Schultz" <schultzk@uni-trier.de>

> 
> Hi,
> 
> 	I have to see what my Canon 20D will do in BW-mode.
> 	But as far as resolution and OCR go, you do not want to go any higher
> 	than 300 dpi, because above 300 dpi the OCR starts to see the
> 	structure of the paper and makes mistakes. This is even more
> 	important with older books. Try just using 144 dpi; this should
> 	give you the same results as 300 dpi.
> 
> 		Keith.

From geoff.horton at gmail.com  Tue Apr  5 06:19:00 2005
From: geoff.horton at gmail.com (Geoff Horton)
Date: Tue Apr  5 06:19:03 2005
Subject: [gutvol-d] Scanners vs. Cameras
Message-ID: <94e5f59605040506195852fdf6@mail.gmail.com>

I've posted a couple photographed books to DP, and while the results
aren't perfect, they were legible. I used nothing fancy--a 4MP camera,
hand-held (though I'm looking at ways to brace it, since shaky images
are my biggest problem), a table lamp for light, manual exposure,
auto-focus (set on close-up mode).

I'm taking notes of what works and what doesn't, with an eye toward
assembling a guide to photographing books for OCR (if such a thing
doesn't already exist).

I'll say up front that I've not been able to get flash photography to
work. The page washes out completely.

Geoff
From nwolcott at dsdial.net  Fri Apr  8 18:21:33 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Sun Apr 10 08:19:06 2005
Subject: [gutvol-d] More PG spam being spread around
References: <000a01c5352c$9790a780$0c9495ce@gw98>
	<20050404032054.GA6488@pglaf.org>
Message-ID: <000d01c53de0$9cf68f20$8e9495ce@gw98>

I've published 2 books of Jules Verne on Lulu. I think it is great for value
added stuff; I added the French to one of the books, and as far as I know it is
the first dual language Verne text since the 1920's. The pictures I have
inserted came out fairly well, although a little improvement is needed, but
they have not been reprinted since 1885. 600 dpi does not make for the
greatest illustrations. In any event this type of publishing is far superior
to the Fredonia product, and the books, if sold near cost as they are for
$6-$8, are a much better buy than the Fredonia product at $20-$35 and others
even higher. I think Lulu has found a nice niche market.
----- Original Message -----
From: "Greg Newby" <gbnewby@pglaf.org>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Sunday, April 03, 2005 11:20 PM
Subject: Re: [gutvol-d] More PG spam being spread around


> On Wed, Mar 30, 2005 at 08:29:50AM -0500, N Wolcott wrote:
> > Resellers of PG books have taken on a new target, Lulu.com.
>
> Thanks for sending this info, Norm.  I talked with Lulu
> a couple of years ago, and have been in recent communication
> with them.  I think the titles you found are just early
> experiments (though not hidden from public view, as they
> should be).
>
> I've told Lulu that our items must be clearly indicated
> as public domain.  If we can agree on things (which seems
> likely - they've been pretty open to discussion, so far),
> I see POD as a viable option for people who want our stuff.
> Maybe a "buy it now" link in our catalog hits...maybe some
> other way of allowing people who want print to get it.
>
> Compared to other publishers and book resellers I've
> interacted with over the years, Lulu is unusually receptive
> to our demands and tendencies.  This is probably because
> they have such diversity in their author base.
>
> Specific advice or demands or qualifications on how Lulu
> might sell our stuff would be welcome.  It looks like they're
> using some templates for our content, so we could add a lot
> of general boilerplate.  Plus, as I said, a clear statement
> that the whole book (cover to cover, including the covers)
> is public domain, and suitable for unlimited redistribution.
>   -- Greg
>
> > Lulu offers POD publishing at zero up front cost, thus luring those who
> > find free advertising for their spam. The postings I have seen so far both
> > imply PG and Lulu are supporting their spam. They advertise the quality of
> > their texts as being from PG. One of them admits there may be errors.
> >
> > There is probably nothing for PG to do except to get Lulu to take the PG
> > off their customers' postings. If they want to host 15000 books on their
> > computers for free that is their business. I quote my post to the Lulu
> > forum. I have posted 2 books to Lulu at 15 cent royalty with added content
> > to the PG text and I do not mention PG in the blurb. My "quality" book may
> > soon be submerged in a flood of Lulu spam.
> >
> > Posting follows:
> > -------------------------
> > Lulu offers a good service for self publishers who provide "content
> > added" material. This allows the publisher to continually upgrade the
> > product until it is in final form, then market it through Lulu's various
> > mechanisms.
> >
> > However, recently public domain texts lifted from Project Gutenberg have
> > been appearing on Lulu. The accompanying blurb states that www.lulu.com and
> > Project Gutenberg have joined forces to offer you these long out of print
> > books. The implication is that somehow Lulu and PG are supporting this
> > effort. PG is trademarked and there is no right to use the name in
> > advertising; enforcing the trademark is another thing, however, for an all
> > volunteer organization.
> >
> > Software exists to move PG texts to a number of formats, iPod, ebook,
> > etc., including Lulu. So there is a real possibility that most of the 15000
> > PG books could end up being hosted on Lulu. No review copy would ever be
> > required, so the posting for the converter would be free. Lulu could end up
> > hosting the entire PG corpus for free in a kind of publishing spam. The
> > books are listed with a royalty of $1 to $2.  One is published with a $1.59
> > royalty, and claims that $1 will be contributed to PG for every book sold.
> > This leaves only 27 cents for the seller.
> >
> > In one case the publisher had re-copyrighted the book and in the other
> > had listed it as Public Domain. Nothing wrong with this, but the copyright
> > only applies to "new material" and certainly not the entire book. In one
> > case an ISBN number was listed, so Lulu might have gotten some revenue from
> > that if the ISBN is real. One of the books was listed as 5000 in sales, so I
> > imagine that is how many Lulu has in its archive. It may soon get 14,999
> > more!
> >
> > Another feature with Lulu is you never know who is selling the book.
> > Lulu distributes it, but the real seller is someone else, unknown. This may
> > raise legal issues about ultimate responsibility.
> >
> > People like myself who provide added content at no or minimal royalty
> > will be unhappy to see our listing efforts buried in an avalanche of Lulu
> > spam. At the very least Lulu should require permission before violating
> > trademark laws.
> >
> > To see the books in this post, search for "Verne" on Lulu.
> >
> > The additional cost of hosting all these books could end up forcing
> > up front charges on Lulu providers or radically restructuring the way Lulu
> > operates, neither of which is desirable in my humble opinion.
> >
> > I mention this as a discussion topic, as I feel it is an emerging
> > problem.
> >
> > ---------------------
> > N Wolcott  nwolcott2@post.harvard.edu

From marcello at perathoner.de  Mon Apr 11 16:07:31 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Apr 11 16:07:46 2005
Subject: [gutvol-d] Removal of /public/html/gutenberg
Message-ID: <425B0333.7080509@perathoner.de>

The old directory /public/html/gutenberg on ibiblio will be removed on 
April 14.

If you did edit files in the old directory after Feb 23 you may have to 
copy them to the new directory /public/vhost/g/gutenberg/html .
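
For anyone unsure whether they touched anything, a sketch along these lines could list the files changed since the cutoff. The two paths are the ones given above; the year 2005 is taken from the date of this message, Feb 23 00:00 is used as a deliberately conservative cutoff, and the function name is made up. It only prints suggested copy commands and does not copy anything itself.

```python
import os
from datetime import datetime


def files_changed_since(root, cutoff):
    """Return paths (relative to root) of files modified after cutoff."""
    cutoff_ts = cutoff.timestamp()
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > cutoff_ts:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)


if __name__ == "__main__":
    OLD = "/public/html/gutenberg"
    NEW = "/public/vhost/g/gutenberg/html"
    # Year assumed from the message date; midnight at the start of
    # Feb 23 is used so that nothing edited on that day is missed.
    for rel in files_changed_since(OLD, datetime(2005, 2, 23)):
        print(f"cp -p {os.path.join(OLD, rel)} {os.path.join(NEW, rel)}")
```

Check each listed file by hand before copying, since the new directory may already hold a newer version.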


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Fri Apr 15 00:09:45 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Apr 15 00:10:21 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <1c7.269b00b5.2f90c2b9@aol.com>


last week, amazon bought booksurge, a p.o.d. company.

yesterday, amazon bought mobipocket, a viewer-app.
(franklin, who owned a lot of mobipocket stock, is 
going to make a couple million bucks from the deal.)

do you people here realize what all this means?

no, of course you don't.

-bowerbird
From brandon at corruptedtruth.com  Fri Apr 15 00:38:19 2005
From: brandon at corruptedtruth.com (Brandon Galbraith)
Date: Fri Apr 15 00:38:55 2005
Subject: [gutvol-d] do you realize what this means?
In-Reply-To: <1c7.269b00b5.2f90c2b9@aol.com>
References: <1c7.269b00b5.2f90c2b9@aol.com>
Message-ID: <425F6F6B.8010405@corruptedtruth.com>

Bowerbird,

I would assume this would mean that they're pushing forward into ebooks. 
Ebooks that are under copyright. Looks like a good thing though. The 
more people who have the ability to read ebooks which are under 
copyright, the more people who can read all ebooks (read: PG ebooks). 
Project Gutenberg continues pushing forward, amassing more material as 
said material falls into the public domain. Future books won't need as 
much effort from us then, since all we'll need to do then is provide 
server space (they'll already be in electronic format). In the end, PG 
goes from being an organization converting physical books to electronics 
ones into an organization that ensures the duration of Copyright remains 
a sane amount (not that it's sane currently, but you get the picture). 
How is this a bad thing?

-brandon

Bowerbird@aol.com wrote:

>last week, amazon bought booksurge, a p.o.d. company.
>
>yesterday, amazon bought mobipocket, a viewer-app.
>(franklin, who owned a lot of mobipocket stock, is 
>going to make a couple million bucks from the deal.)
>
>do you people here realize what all this means?
>
>no, of course you don't.
>
>-bowerbird


From davedoty at hotmail.com  Fri Apr 15 06:16:19 2005
From: davedoty at hotmail.com (Dave Doty)
Date: Fri Apr 15 06:16:22 2005
Subject: [gutvol-d] do you realize what this means?
In-Reply-To: <1c7.269b00b5.2f90c2b9@aol.com>
Message-ID: <BAY101-F1218AB12BEFEDA562FC03EDF360@phx.gbl>

>From: Bowerbird@aol.com

>do you people here realize what all this means?
>
>no, of course you don't.

Why are you still here, if you hate us all so much and hold us in such 
contempt?  I've lost count of the times you've quit the list forever to go 
do it right somewhere else.  Whatever happened to that?

Dave Doty


From cannona at fireantproductions.com  Fri Apr 15 10:38:45 2005
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Fri Apr 15 10:40:02 2005
Subject: [gutvol-d] do you realize what this means?
In-Reply-To: <1c7.269b00b5.2f90c2b9@aol.com>
References: <1c7.269b00b5.2f90c2b9@aol.com>
Message-ID: <6.1.2.0.0.20050415121703.01d276d8@mail.fireantproductions.com>

At 02:09 AM 4/15/2005, you wrote:


<snip>


>do you people here realize what all this means?
>
>no, of course you don't.


No!  Of course we don't!  But I'm sure we can look forward to a long string 
of randomly punctuated messages where you tell us how completely stupid we 
are, followed by several messages which say something along the lines of, 
"I've had enough of this list.  I'm going to unsubscribe soon."  Proceeding 
this will come one or more ads for your blog, followed by some more "no 
really, I'm leaving this time, really!" messages.  Then, finally, if we're 
extremely lucky, we'll get about 60 days of peace until you come back and 
start the whole thing over again.

Those of you who are new to this list might think I'm exaggerating.  Just 
stick around for a few weeks, you'll see that I'm not.

Aaron Cannon


--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail 
address.)  


From Bowerbird at aol.com  Fri Apr 15 11:35:28 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Apr 15 11:35:53 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <1f5.7c89dab.2f916370@aol.com>

aaron said:
>   No!  Of course we don't!  But I'm sure we can look forward to 
>   a long string of randomly punctuated messages where you 
>   tell us how completely stupid we are, followed by 
>   several messages which say something along the lines of,
>   "I've had enough of this list.  I'm going to unsubscribe soon."  
>   Proceeding this will come one or more ads for your blog, 
>   followed by some more "no really, I'm leaving this time, really!" 
>   messages.  Then, finally, if we're extremely lucky, we'll get about 
>   60 days of peace until you come back and start the whole thing over 
>   again.

randomly punctuated messages?

-bowerbird
From joshua at hutchinson.net  Fri Apr 15 12:01:02 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Fri Apr 15 12:00:36 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <20050415190102.BBA502F9FC@ws6-3.us4.outblaze.com>


----- Original Message -----
From: Bowerbird@aol.com
> 
> aaron said:
> >   No!  Of course we don't!  But I'm sure we can look forward to
> >   a long string of randomly punctuated messages where you
> >   tell us how completely stupid we are, followed by
> >   several messages which say something along the lines of,
> >   "I've had enough of this list.  I'm going to unsubscribe soon."
> >   Proceeding this will come one or more ads for your blog,
> >   followed by some more "no really, I'm leaving this time, really!"
> >   messages.  Then, finally, if we're extremely lucky, we'll get about
> >   60 days of peace until you come back and start the whole thing over
> >   again.
> 
> randomly punctuated messages?
> 

I'm sure that in your madness you see a method to your punctuation....

But to the sane, it looks pretty random.

That, and it is obvious that your shift key was broken long, long ago.

Josh
From geoff.horton at gmail.com  Fri Apr 15 12:02:41 2005
From: geoff.horton at gmail.com (Geoff Horton)
Date: Fri Apr 15 12:02:49 2005
Subject: [gutvol-d] do you realize what this means?
In-Reply-To: <20050415190102.BBA502F9FC@ws6-3.us4.outblaze.com>
References: <20050415190102.BBA502F9FC@ws6-3.us4.outblaze.com>
Message-ID: <94e5f59605041512024943e243@mail.gmail.com>

Can you all conduct your flame war in someone else's inbox, please?
Regardless of how irritating some of us find others of us, it's not
germane to getting material on PG, and that's what I'm here for.
From joshua at hutchinson.net  Fri Apr 15 12:08:02 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Fri Apr 15 12:07:31 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <20050415190802.F39149E8CC@ws6-2.us4.outblaze.com>

Boy, you must be new around here!  Three messages would barely qualify as a border skirmish in the epic flame wars bowerbird has started.

That being said, I do apologize and promise not to rise to the bait again this cycle.

Josh


----- Original Message -----
From: "Geoff Horton" <geoff.horton@gmail.com>
> 
> Can you all conduct your flame war in someone else's inbox, please?
> Regardless of how irritating some of us find others of us, it's not
> germane to getting material on PG, and that's what I'm here for.

From Bowerbird at aol.com  Fri Apr 15 13:25:19 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Fri Apr 15 13:25:41 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <7a.7170d03f.2f917d2f@aol.com>

brandon said:
>   How is this a bad thing?

um, are you so naive as to "what this means"
that you think i believe it to be "a bad thing"?

that's kind of sad, in a humorous type of way...          :+)

***

dave said:
>   Why 

well, i fully understand your impulse to strike out at me.
i've questioned your competence and intelligence, again,
and that's never a nice thing to hear.  the reason i do it
anyway is because _i_care_ about electronic-books, and
those of us who care about e-books and who _do_ realize
what's up need to tell you to take your head out of your ass.

but you are right in that i don't think enough of you to do
any more than that, not any longer, does that surprise you?
will this turn into another long flamewar?  not a chance.
with progress moving so fast, i can't waste my time here.
just thought i'd drop a note here about recent developments.

but, just to be clear, because you seem to be confused,
i will never leave this listserve _entirely_, but instead
will return here on occasion to give y'all wake-up calls...

forewarned is forearmed; acquaint yourself with your delete key
if you really feel a strong need to protect your undisturbed sleep.
heck, even better, apply some backchannel pressure to greg newby
to "moderate" me -- again -- and absolve me of this onerous duty...


>   Whatever happened to that?

good question, dave, thanks for asking it...          :+)

you can join the beta-test for my viewer.  but you knew that.

i don't feel the need to show you all my cards until you guys
have thrown _considerably_ more chips into the pot first;
when i leave this table, i will take your whole stake with me.

specifically, you'll need to:
1.  put the time and energy into developing your markup format.
2.  do the work of developing procedures to apply that markup.
3.  pay the price in volunteers by implementing the procedures.
4.  put the time and energy into developing conversion routines.
5.  mark up the 20,000+ e-text "backlog" of the current library.

once you have done all _that_, i'll be happy to show you exactly
how you could've gotten the same benefits _without_ markup...

but not _until_ then...          :+)

and so far, you haven't even gotten past step 1 yet...

so _y'all_ don't have any time to waste in a flamewar either.
thus i suggest we just let this thread die, and rest in peace...

finally, i dunno what _you_ might be thinking, dave, 
but i'm thinking that my strategy of sitting back and
waiting for the marketplace to peg a _big_ number on 
the value of a current "best-of-breed" viewer-program
has just proven itself with a _very_ worthy dividend...

-bowerbird
From Gutenberg9443 at aol.com  Fri Apr 15 14:00:31 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Fri Apr 15 14:00:46 2005
Subject: [gutvol-d] do you realize what this means?
Message-ID: <ea.678b91b7.2f91856f@aol.com>

 
In a message dated 4/15/2005 1:10:47 AM Mountain Standard Time,
Bowerbird@aol.com writes:

no, of course you don't.


Yes. It means you almost certainly have Asperger's Disorder. Get in touch
with a psychiatrist specializing in forms of autism and arrange to be evaluated
for it. Unfortunately there's no cure for it, which is why I seem to wind up
in so many flame wars that I never MEANT to get into, but once you know you
have it you can occasionally remember to ask yourself, "Do I really want to
say this, or does Asperger's want to say this?" Of course you MIGHT have
paranoia with or without schizophrenia, or schizophrenia with or without
paranoia, or you MIGHT have bipolar depression, but it's really hard to
believe that you could succeed in being this obnoxious without SOME kind of
mental illness. It definitely looks to me like autism or a related
personality disorder, though.

By the way, I OWN an e-publishing house. I distribute on FictionWise.com and
on eBookWise.com, and I DON'T republish anything from Gutenberg. I feel very
strongly that selling something that is easily available free is
counterproductive and somewhat dishonest. I did post one thing on Gutenberg
several years ago, with a note that it was copyright and could be used only
for nonprofit purposes. That one I have now posted for sale, but I already
owned the copyright.

Ebooks are like videos: you can buy them, or rent them, or check them out
from a library for free. You're seeing a war where there isn't one. Television
didn't kill radio. Videos and, now, DVDs didn't kill television. E-book stores
aren't going to kill Gutenberg. Your thinking is much too narrow; that's
another reason I'm suspecting autism of some sort.

That is all I have to say on this topic. I just came on to check emails. I
have pneumonia and I am going back to bed now. Now go away until after you've
seen a good psychiatrist. As it might take several months for you to get in,
and then another six months before the psychiatrist is able to figure out what
combination of meds will help you the most (I changed meds every two weeks
for eight months), everybody else can have a nice long rest.


Anne

Do you like to breathe?
Then save the trees!
Begin a personal relationship
with an ebook
TODAY!
From bowerbird at aol.com  Fri Apr 15 16:40:14 2005
From: bowerbird at aol.com (bowerbird@aol.com)
Date: Fri Apr 15 16:40:40 2005
Subject: [gutvol-d] do you realize what this means?
In-Reply-To: <ea.678b91b7.2f91856f@aol.com>
References: <ea.678b91b7.2f91856f@aol.com>
Message-ID: <8C7102B9B68B0D0-CF8-F92B@mblk-r02.sysops.aol.com>

anne said:
> Asperger's Disorder

anne, i've got to hand it to you! :+)

your post is humorous in a humorous type of way!

most entertaining thing i've read all week, and this
has been an exquisitely entertaining week for me!

let's hope everyone else is smart enough to realize
that this is the _perfect_ way to end this thread!

meanwhile, i hope you shake that pneumonia soon!

-bowerbird
From benbradley at frontiernet.net  Sat Apr 16 20:33:13 2005
From: benbradley at frontiernet.net (Ben Bradley)
Date: Sat Apr 16 20:33:36 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
Message-ID: <4261D8F9.4000805@frontiernet.net>

    It looks like a lively, lightly-moderated list here. As a long-term 
participant on Usenet as well as many email lists, this is nothing new 
to me and I feel perfectly welcome. :)

    My first question involves the book "On The Sensations of Tone" by
Hermann Helmholtz. I recently saw mention of it online and decided that 
since its copyright has long expired I might find it on the GP site. I 
don't see it, nor did I find the text elsewhere online, but further 
investigation (reading much of the PG FAQ) led me to this page with list 
of books in progress:

http://www.dprice48.freeserve.co.uk/GutIP.html

    The relevant portion of that webpage is:

Helmholtz, Hermann Ludwig Ferdinand von (31aug1821-8sep1894)
The Mystery of Creation - Copyright cleared 23 Nov 1997
On the Sensations of Tone as a Physiological Basis for the Theory of
Music - Copyright cleared 17 Sep 2003

    How do I find the "real" status of this book, whose copyright was 
cleared seven months ago but is not yet online? If it's not actively
being converted to text by someone else, I'd like to do it (as soon as I 
get my own physical copy). Apparently, I should email Mr. Price at the 
address indicated on that webpage to see what he might know, but I'd 
also like to know if I'm missing something in relation to this.

    Second question: Why are there two copies of Thomas Paine's "Common
Sense" on the GP site?

    And a third: I just came across yet another copyrighted book online. 
All such books I've seen on the web are apparently put online legally by 
the author or with the author's permission. Is there a list of these 
books, perhaps as a part of an index of "all books online"? And yes, I 
know this is fairly tangential to the purpose of the GP, as most such 
authors generally choose to retain full copyright and have their books 
available ONLY on their sites, so they can have some control, such as 
using the site to advertise or sell physical copies of the book.

From jtinsley at pobox.com  Sat Apr 16 21:03:38 2005
From: jtinsley at pobox.com (Jim Tinsley)
Date: Sat Apr 16 21:03:56 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
In-Reply-To: <4261D8F9.4000805@frontiernet.net>
References: <4261D8F9.4000805@frontiernet.net>
Message-ID: <20050417040338.GA19736@panix.com>

On Sat, Apr 16, 2005 at 11:33:13PM -0400, Ben Bradley wrote:
>   It looks like a lively, lightly-moderated list here. As a long-term 
>participant on Usenet as well as many email lists, this is nothing new 
>to me and I feel perfectly welcome. :)
>

As a denizen of Usenet, you may feel even more at home in a while. :-)

>   My first question involves the book "On The Sensations of Tone" by
>Hermann Helmholtz. I recently saw mention of it online and decided that 
>since its copyright has long expired I might find it on the GP site. I 
>don't see it, nor did I find the text elsewhere online, but further 
>investigation (reading much of the PG FAQ) led me to this page with list 
>of books in progress:
>
>http://www.dprice48.freeserve.co.uk/GutIP.html
>
>   The relevant portion of that webpage is:
>
>Helmholtz, Hermann Ludwig Ferdinand von (31aug1821-8sep1894)
>The Mystery of Creation - Copyright cleared 23 Nov 1997
>On the Sensations of Tone as a Physiological Basis for the Theory of
>Music - Copyright cleared 17 Sep 2003
>
>   How do I find the "real" status of this book, whose copyright was 
>cleared seven months ago but is not yet online? If it's not actively
>being converted to text by someone else, I'd like to do it (as soon as I 
>get my own physical copy). Apparently, I should email Mr. Price at the 
>address indicated on that webpage to see what he might know, but I'd 
>also like to know if I'm missing something in relation to this.

Nope. Not missing anything. Somebody's got it. They may or may not
be doing anything with it. It's not queued up at DP. But seven
months ain't nuthin' much.


>
>   Second question: Why are there two copies of Thomas Paine's "Common
>Sense" on the GP site?

Same reason there are three Grimms, three Odysseys, three Iliads
in English and one in French, seven or eight of Hamlet, two Valley
of Fears, two Literary Tastes, and yadda-yadda. FAQ V.32 and R.36.
Basically, if it comes from a different paper edition, we post it
separately, and give it a new number.

And BTW, it's always "PG", not "GP".


>   And a third: I just came across yet another copyrighted book online. 
>All such books I've seen on the web are apparently put online legally by 
>the author or with the author's permission. Is there a list of these 
>books, perhaps as a part of an index of "all books online"? 

Not that I'm aware of. You can quickly get the list of PG-posted
copyrighted works (not necessarily titles) from GUTINDEX.ALL,
but people put all kinds of stuff online and call it copyrighted
but available for private use, and they don't necessarily 
register it with us or anyone else.

jim

From cweyant at twcny.rr.com  Sun Apr 17 04:56:43 2005
From: cweyant at twcny.rr.com (Curtis A. Weyant)
Date: Sun Apr 17 04:59:49 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
In-Reply-To: <4261D8F9.4000805@frontiernet.net>
References: <4261D8F9.4000805@frontiernet.net>
Message-ID: <42624EFB.3040900@twcny.rr.com>

Ben Bradley wrote:
> 
>    And a third: I just came across yet another copyrighted book online.
> All such books I've seen on the web are apparently put online legally by
> the author or with the author's permission. Is there a list of these
> books, perhaps as a part of an index of "all books online"?

The Online Books Page (http://onlinebooks.library.upenn.edu/) lists both
public domain and copyrighted books online. I'm sure it doesn't list ALL
books, but it will help you find a good many.

Curtis.
From jmk at his.com  Sun Apr 17 05:16:30 2005
From: jmk at his.com (Janet Kegg)
Date: Sun Apr 17 05:16:41 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
In-Reply-To: <4261D8F9.4000805@frontiernet.net>
References: <4261D8F9.4000805@frontiernet.net>
Message-ID: <68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>

On Sat, 16 Apr 2005 23:33:13 -0400, you wrote:

 
>http://www.dprice48.freeserve.co.uk/GutIP.html
>
>    The relevant portion of that webpage is:
>
>Helmholtz, Hermann Ludwig Ferdinand von (31aug1821-8sep1894)
>The Mystery of Creation - Copyright cleared 23 Nov 1997
>On the Sensations of Tone as a Physiological Basis for the Theory of
>Music - Copyright cleared 17 Sep 2003
>
>    How do I find the "real" status of this book, whose copyright was 
>cleared seven months ago but is not yet online? If it's not actively
>being converted to text by someone else, I'd like to do it (as soon as I 
>get my own physical copy). Apparently, I should email Mr. Price at the 
>address indicated on that webpage to see what he might know, but I'd 
>also like to know if I'm missing something in relation to this.

Er, "cleared 17 Sep 2003" makes the clearance 19 months old, not 7.
Would the book likely contain musical notation?  If so, that challenge
might account for its sitting in someone's to-do pile for so long.

I'd advise you to go ahead and do it yourself since the clearance is
so stale.  If you haven't already, you might want to wander over to
Distributed Proofreaders (www.pgdp.net)--an inquiry about the
Helmholtz book on DP's Content Providers forum might prove useful.

-- Janet  
 
From joshua at hutchinson.net  Sun Apr 17 10:57:19 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Sun Apr 17 10:56:17 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
In-Reply-To: <68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>
References: <4261D8F9.4000805@frontiernet.net>
	<68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>
Message-ID: <4262A37F.5020109@hutchinson.net>

Janet Kegg wrote:

>Er, "cleared 17 Sep 2003" makes the clearance 19 months old, not 7.
>Would the book likely contain musical notation?  If so, that challenge
>might account for its sitting in someone's to-do pile for so long.
>
>I'd advise you to go ahead and do it yourself since the clearance is
>so stale.  If you haven't already, you might want to wander over to
>Distributed Proofreaders (www.pgdp.net)--an inquiry about the
>Helmholtz book on DP's Content Providers forum might prove useful.
>
>-- Janet  
>  
>
Also, an e-mail to David will usually result in him double-checking the 
status with the original person (he has access to that information).  
They may have disappeared from the face of the earth or just cleared it 
and then never got around to it... If it is an especially hard work, 
they may still be working on it.  David can help you find out the 
current situation.

Josh
From Gutenberg9443 at aol.com  Sun Apr 17 11:01:58 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Sun Apr 17 11:02:09 2005
Subject: [gutvol-d] Questions on  GP and other ebooks
Message-ID: <dd.24c127c7.2f93fe96@aol.com>

 
In a message dated 4/17/2005 5:59:55 AM Mountain Standard Time,  
cweyant@twcny.rr.com writes:

Ben Bradley wrote:
> 
>    And a third: I just came across yet another copyrighted book online.
> All such books I've seen on the web are apparently put online legally by
> the author or with the author's permission.


Unfortunately, that is not correct. An incredible number of pirated books, 
reasonably old or brand new, show up online. PG tries its best to avoid this 
practice, and I THINK all the in-copyright books on PG are legal, but other 
than that, the only way to be sure you're getting legal books is to go to the 
author's or copyright owner's Website or to go to a commercial ebook 
distributor such as FictionWise. BTW, a copyrighted book on which I own the 
copyright showed up on a commercial site three days after it was legally 
posted on PG.
 
Anne

Do you like to breathe?
Then save the trees! 
Begin a personal relationship
with an ebook 
TODAY!
From collin at xs4all.nl  Sun Apr 17 12:33:41 2005
From: collin at xs4all.nl (Branko Collin)
Date: Sun Apr 17 12:21:55 2005
Subject: [gutvol-d] Reply comments on US issues raised by "orphan works"
Message-ID: <4262D635.14434.13F7655@localhost>


From what I understand, no PG volunteer (not counting John Mark 
Ockerbloom) has posted a comment to the US Copyright Office's RFC 
about "orphan works".

So now US copyright policy will be guided by comments such as Kristie 
Hubler's: "If people have access to my work without paying for it, 
and using it just to make a buck, it would be as if I were being 
raped, or having a child I bore ripped out of my arms, never to be 
seen again."

It may very well be possible that PG volunteers have no opinion about 
orphan works. In that case, consider this e-mail message not sent. 

If you are concerned about orphan works, now is your last chance to 
be heard. You can send reply comments to the US Copyright Office 
until May 5, 2005. The rules are explained here: 
<http://www.copyright.gov/orphan/comments/>

Initially some 700 comments were submitted. It's hard to read through 
all of them, so I suggest Googling for abstracts, looking for the 
usual suspects et cetera.

There's an unused Wiki running at 
<http://www.gutenberg.org/cgi-bin/wiki-newsletter.cgi>. I suggest you 
use that for sharing notes. 

Do not wait for an official PG position statement. Official 
statements are hard to draft, because they require consensus. Also, 
official statements tend to sound impersonal, exactly because they 
represent a consensus position. There is nothing wrong with sounding 
like you actually care (although I would leave the Hubler-style 
hysterics at home).

-- 
branko collin
collin@xs4all.nl
From donovan at abs.net  Sun Apr 17 12:32:34 2005
From: donovan at abs.net (D Garcia)
Date: Sun Apr 17 12:31:17 2005
Subject: [gutvol-d] Questions on  PG and other ebooks
In-Reply-To: <68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>
References: <4261D8F9.4000805@frontiernet.net>
	<68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>
Message-ID: <200504171532.34484.donovan@abs.net>

On Sunday 17 April 2005 08:16 am, Janet Kegg wrote:

> Er, "cleared 17 Sep 2003" makes the clearance 19 months old, not 7.
> Would the book likely contain musical notation?  If so, that challenge
> might account for its sitting in someone's to-do pile for so long.
>
> I'd advise you to go ahead and do it yourself since the clearance is
> so stale.  If you haven't already, you might want to wander over to
> Distributed Proofreaders (www.pgdp.net)--an inquiry about the
> Helmholtz book on DP's Content Providers forum might prove useful.

I wouldn't call that clearance "stale" exactly .. Juliet Sutherland for 
example has a huge number of clearances from 2003-ish that will eventually be 
scanned and processed. The 1997 one though I would definitely proceed on.

Your point about those being more difficult or delayed due to musical content 
is certainly valid, though.

Cheers!
From prosfilaes at gmail.com  Sun Apr 17 21:32:05 2005
From: prosfilaes at gmail.com (David Starner)
Date: Sun Apr 17 21:32:23 2005
Subject: [gutvol-d] Questions on GP and other ebooks
In-Reply-To: <4262A37F.5020109@hutchinson.net>
References: <4261D8F9.4000805@frontiernet.net>
	<68k4615p9ouq8ls4km827mptereulu95ts@4ax.com>
	<4262A37F.5020109@hutchinson.net>
Message-ID: <6d99d1fd05041721326f39efb7@mail.gmail.com>

On 4/17/05, Joshua Hutchinson <joshua@hutchinson.net> wrote:

> Also, an e-mail to David will usually result in him double-checking the
> status with the original person (he has access to that information).
> They may have disappeared from the face of the earth or just cleared it
> and then never got around to it... If it is an especially hard work,
> they may still be working on it.  David can help you find out the
> current situation.

It's a good idea in any case. There are books, not particularly hard
ones, that have got stuck in post-proofing at DP or elsewhere, and there's
no reason to redo the work that's already been done on them.
From donovan at abs.net  Mon Apr 18 15:21:51 2005
From: donovan at abs.net (D Garcia)
Date: Mon Apr 18 15:20:38 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <6d99d1fd05041721326f39efb7@mail.gmail.com>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
Message-ID: <200504181821.51488.donovan@abs.net>

On Monday 18 April 2005 12:32 am, David Starner wrote:
> It's a good idea in any case. There are books, not particularly hard
> ones, that have got stuck in post-proofing at DP or elsewhere, and there's
> no reason to redo the work that's already been done on them.

Except that many of those "stuck" books are waiting on missing pages/images, 
etc.

You may find that instead of redoing a book all on your own, you can be the 
person that provides that one last missing piece to allow the existing but 
incomplete work to be finished.
From hart at pglaf.org  Tue Apr 19 09:40:46 2005
From: hart at pglaf.org (Michael Hart)
Date: Tue Apr 19 09:40:48 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <200504181821.51488.donovan@abs.net>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
Message-ID: <Pine.LNX.4.60.0504190939530.23081@pglaf.org>


On Mon, 18 Apr 2005, D Garcia wrote:

> On Monday 18 April 2005 12:32 am, David Starner wrote:
>> It's a good idea in any case. There are books, not particularly hard
>> ones, that have got stuck in post-proofing at DP or elsewhere, and there's
>> no reason to redo the work that's already been done on them.
>
> Except that many of those "stuck" books are waiting on missing pages/images,
> etc.

Any reason not to post them with a comment that these pages are missing?

Readers would thus be encouraged to help find the missing pages.


Michael

From sly at victoria.tc.ca  Tue Apr 19 10:43:21 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Tue Apr 19 10:43:28 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504190939530.23081@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
Message-ID: <Pine.GSO.4.58.0504191038120.12377@vtn1.victoria.tc.ca>


On Tue, 19 Apr 2005, Michael Hart wrote:

> On Mon, 18 Apr 2005, D Garcia wrote:
>
> > On Monday 18 April 2005 12:32 am, David Starner wrote:
> >> It's a good idea in any case. There are books, not particularly hard
> >> ones, that have got stuck in post-proofing at DP or elsewhere, and there's
> >> no reason to redo the work that's already been done on them.
> >
> > Except that many of those "stuck" books are waiting on missing pages/images,
> > etc.
>
> Any reason not to post them with a comment that these pages are missing?
>
> Readers would thus be encouraged to help find the missing pages.
>
>

Or, another possibility, given that many people don't look too closely,
would be that someone has a copy of the book, sees that it is already in
PG, and then moves on to something else...


Andrew
From fvandrog at scripps.edu  Tue Apr 19 10:50:04 2005
From: fvandrog at scripps.edu (Frank van Drogen)
Date: Tue Apr 19 10:50:11 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504190939530.23081@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
Message-ID: <6.2.0.8.0.20050419104217.01edb008@mail.scripps.edu>


>>Except that many of those "stuck" books are waiting on missing pages/images,
>>etc.
>
>Any reason not to post them with a comment that these pages are missing?
>
>Readers would thus be encouraged to help find the missing pages.


I think it would be a good idea to ask the audience of PG to help look 
for missing pages. This would certainly increase the chance of recovering 
them, and of our posting complete books.

However, I am strongly opposed to posting incomplete books. If the public 
cannot be sure whether the books they download are complete or not, they 
will move on to a place where quality can be guaranteed. I think in these 
kinds of issues quality should prevail over quantity.

Kind regards,

Frank 

From bruce at zuhause.org  Tue Apr 19 13:03:20 2005
From: bruce at zuhause.org (Bruce Albrecht)
Date: Tue Apr 19 13:03:31 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504190939530.23081@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
Message-ID: <16997.25608.758628.394332@celery.zuhause.org>

Michael Hart writes:
 > 
 > On Mon, 18 Apr 2005, D Garcia wrote:
 > 
 > > On Monday 18 April 2005 12:32 am, David Starner wrote:
 > >> It's a good idea in any case. There are books, not particularly hard
 > >> ones, that have got stuck in post-proofing at DP or elsewhere, and there's
 > >> no reason to redo the work that's already been done on them.
 > >
 > > Except that many of those "stuck" books are waiting on missing pages/images,
 > > etc.
 > 
 > Any reason not to post them with a comment that these pages are missing?
 > 
 > Readers would thus be encouraged to help find the missing pages.

It's Distributed Proofreaders' policy not to post to PG when there
are missing pages.  They have a forum listing projects that are
missing pages so that DP volunteers can seek out the missing pages.

Personally, I would not buy books from a publisher with a reputation
for knowingly publishing books with pages missing, nor would I want to
download from PG if it had a reputation for knowingly publishing
etexts that are missing pages.

From phil at hitchcock99.freeserve.co.uk  Tue Apr 19 13:58:58 2005
From: phil at hitchcock99.freeserve.co.uk (Phil Hitchcock)
Date: Tue Apr 19 14:01:00 2005
Subject: [gutvol-d] Questions on PG and other ebooks
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6.2.0.8.0.20050419104217.01edb008@mail.scripps.edu>
Message-ID: <003701c54522$b6cfbf40$f3c2883e@freeserve.co.uk>

> I think it would be a good idea to ask the audience of PG to help look
> for missing pages. This would certainly increase the chance of recovering
> them, and of our posting complete books.
>
> However, I am strongly opposed to posting incomplete books. If the public
> cannot be sure whether the books they download are complete or not, they
> will move on to a place where quality can be guaranteed. I think in these
> kinds of issues quality should prevail over quantity.


PG already issues books with missing pages, e.g. #11866. However it is
stated at the beginning that certain specified pages are missing, so the
reader knows what to expect. If the currently best available copy of a text,
which may be several hundred years old, is missing a few pages, well that is
unfortunate; but surely it is better to give people the chance to read the
99% that is available. Our great museums do not say, this pot has a few
chips in it so we will not exhibit it.

However, possibly there could be a list of PG works that require pages, so
that there would be a higher chance of someone eventually contributing the
missing pages.

Philip.

From grythumn at gmail.com  Tue Apr 19 14:12:43 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Tue Apr 19 14:15:35 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <003701c54522$b6cfbf40$f3c2883e@freeserve.co.uk>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6.2.0.8.0.20050419104217.01edb008@mail.scripps.edu>
	<003701c54522$b6cfbf40$f3c2883e@freeserve.co.uk>
Message-ID: <15cfa2a5050419141222f2a8ad@mail.gmail.com>

On 4/19/05, Phil Hitchcock <phil@hitchcock99.freeserve.co.uk> wrote:
> PG already issues books with missing pages, e.g. #11866. However it is
> stated at the beginning that certain specified pages are missing, so the
> reader knows what to expect. If the currently best available copy of a text,
> which may be several hundred years old, is missing a few pages, well that is
> unfortunate; but surely it is better to give people the chance to read the
> 99% that is available. Our great museums do not say, this pot has a few
> chips in it so we will not exhibit it.

The main reason to avoid incomplete projects at DP is a lack of
resources, both of skilled people and technical resources. In fact
they go together; the Post Processing backlog at DP is causing a
chronic shortage of disk space. If a project has to sit on the server
for 6 additional months waiting for 2 pages, that is not good. Also,
by posting an incomplete work, you add to the already heavy PP workload.

I've got an incomplete project sitting around waiting on two pages..
but someone has already volunteered to take pictures of the missing
pages from the special collection at a nearby university. The existing
system is fairly slow, but it does work in many cases.

Now if something is extremely rare, and all known copies have the same
defect, by all means post it IMO. But otherwise I suggest holding out
for a complete work.

R C
From prosfilaes at gmail.com  Tue Apr 19 14:42:56 2005
From: prosfilaes at gmail.com (David Starner)
Date: Tue Apr 19 14:43:07 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504190939530.23081@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
Message-ID: <6d99d1fd05041914426b0436ac@mail.gmail.com>

On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
> Any reason not to post them with a comment that these pages are missing?
> 
> Readers would thus be encouraged to help find the missing pages.

If you look at book 13921, you'll notice that it's missing pages 98
and 99 ("[Seiten 98 und 99 fehlen!]", embedded in the middle of the
text). How many readers have jumped forward to offer the missing
pages? If it had been kept at DP, we could have found the pages and
added them. But once it's on the shelf, nobody worries about it
anymore.

I've found as a general rule, once a book is posted, the odds of
anything getting done on it drop vastly. It gets moved to the
completed pile, and new books take its place.
From nwolcott at dsdial.net  Wed Apr 20 07:27:21 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Wed Apr 20 07:29:38 2005
Subject: [gutvol-d] New Copyright law
Message-ID: <006b01c545b5$5f083e20$b09495ce@gw98>

What is the impact of the new copyright law passed by the House yesterday on the likes of PG, Ockerbloom, and other sites, not to mention the minions?
N Wolcott  nwolcott2@post.harvard.edu
From davedoty at hotmail.com  Wed Apr 20 08:34:28 2005
From: davedoty at hotmail.com (Dave Doty)
Date: Wed Apr 20 08:34:35 2005
Subject: [gutvol-d] New Copyright law
In-Reply-To: <006b01c545b5$5f083e20$b09495ce@gw98>
Message-ID: <BAY2-F23C265459D1317C2F1C422DF2B0@phx.gbl>

>From: "N Wolcott" <nwolcott@dsdial.net>

>What is the impact of the new copyright law passed by the House yesterday 
>on the likes of PG, Ockerbloom, and other sites, not to mention the minions?

Since you weren't more specific, I had to hit the web to try to figure out 
which law you were talking about.  The only thing I could find that seemed 
to fit the bill was this:

http://news.zdnet.com/2100-9588_22-5677232.html

In a nutshell, it allows prison terms of up to three years for possessing 
even a single copy of a work that hasn't been commercially released, 
regardless of whether it has been shared.

It has not a jot of impact on PG, unless we're planning on putting 
prerelease copyrighted works on the site, and I missed that discussion.  
This isn't a "copyright law" in the sense of changing copyright standards in 
any way,  it just increases the penalty for what was already a violation.  
Since PG is extremely conscientious about copyright, this doesn't matter to 
our work (it may or may not matter to individuals.)

It does show a general attitude of clampdown, but I think we were all 
already well aware of that pervasive attitude in virtually all governmental 
bodies these days.

Dave Doty


From hart at pglaf.org  Wed Apr 20 10:05:12 2005
From: hart at pglaf.org (Michael Hart)
Date: Wed Apr 20 10:05:14 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <6d99d1fd05041914426b0436ac@mail.gmail.com>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
Message-ID: <Pine.LNX.4.60.0504201001380.19133@pglaf.org>


On Tue, 19 Apr 2005, David Starner wrote:

> On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
>> Any reason not to post them with a comment that these pages are missing?
>>
>> Readers would thus be encouraged to help find the missing pages.
>
> If you look at book 13921, you'll notice that it's missing pages 98
> and 99 ("[Seiten 98 und 99 fehlen!]", embedded in the middle of the
> text). How many readers have jumped forward to offer the missing
> pages? If it had been kept at DP, we could have found the pages and
> added them. But once it's on the shelf, nobody worries about it
> anymore.
>
> I've found as a general rule, once a book is posted, the odds of
> anything getting done on it drop vastly. It gets moved to the
> completed pile, and new books take its place.

Then I suggest we keep some kind of notice for our own people
that the book is incomplete, rather than simply ignoring books
once they reach the public.

It's not as if there is some "Digital Divide" that prevents us
from trying to improve our eBooks from both directions.

Why use this kind of reasoning to keep these books from seeing
the light of day?


Michael

From distributedmel at gmail.com  Wed Apr 20 10:16:19 2005
From: distributedmel at gmail.com (Melissa Er-Raqabi)
Date: Wed Apr 20 10:16:36 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504201001380.19133@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<Pine.LNX.4.60.0504201001380.19133@pglaf.org>
Message-ID: <a5a2a50205042010161fe30344@mail.gmail.com>

If we are going to post incomplete books, the notice should not be
just for 'our own people' but should be very clearly stated to the
users of PG. Anything less is deceit, in my mind. Such books should go
in a section for incomplete projects, and the end-user's help should
be specifically solicited, creating a partnership with him, rather
than putting him off by trying to pass off a 'broken' project as one
in good condition.

Melissa

On 4/20/05, Michael Hart <hart@pglaf.org> wrote:
> 
> Then I suggest we keep some kind of notice for our own people
> that the book is incomplete, rather than simply ignoring books
> once they reach the public.
> 
> It's not as if there is some "Digital Divide" that prevents us
> from trying to improve our eBooks from both directions.
> 
> Why use this kind of reasoning to keep these books from seeing
> the light of day?
> 
> 
> Michael
> 
From jonathan_ingram at yahoo.com  Wed Apr 20 10:22:28 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Wed Apr 20 10:22:34 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: 6667
Message-ID: <20050420172228.65670.qmail@web41726.mail.yahoo.com>

The current weekly announcement contains the following:

> We have been invited to peruse the various eBook collections
> of the Internet Archive for potential Project Gutenberg eBooks.
> 
> http://www.archive.org
> 
> Don't worry, many of the numbers listed are out of date,
> but you should get all the files when you pass through
> to the original sites.

People on gutvol-d might be interested to know that at DP we're moving through
a particular Internet Archive collection -- the Canadian Libraries archive --
in a relatively organised fashion. Those of you who are already signed up to DP
can follow the progress of this by looking at the appropriate thread in our
'Providing Content' forum. For a summary of our progress, see here:

  http://tinyurl.com/77rj4

Incidentally, this also provides a nice summary of all the content of this
particular Internet Archive collection. As you can see, we're making good
progress, particularly on the English language texts.

If any of you are already working on any of these texts outside DP, please post
a reply to the DP thread, and I will update the information page to reflect the
fact. This will avoid more than one volunteer working on the same text.

-- 
Jon Ingram


From joshua at hutchinson.net  Wed Apr 20 10:42:31 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Wed Apr 20 10:41:00 2005
Subject: [gutvol-d] Questions on PG and other ebooks
Message-ID: <20050420174231.E03B99E780@ws6-2.us4.outblaze.com>


----- Original Message -----
From: "Michael Hart" <hart@pglaf.org>
> 
> Why use this kind of reasoning to keep these books from seeing
> the light of day?
> 

It is the difference between the ideal and the practical.

Ideally, we could post the incomplete texts and someone would notice the missing pages, find them and add them.

In practice, once it is posted, it takes an act of god for further updates to occur.

At DP, we have a pretty good number of books that get fixed this way.  Once it is posted, that number would virtually disappear.

Josh
From Bowerbird at aol.com  Wed Apr 20 11:18:13 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 11:24:29 2005
Subject: [gutvol-d] New Copyright law
Message-ID: <1b8.119f8c03.2f97f6e5@aol.com>

dave said:
>   It has not a jot of impact on PG

right.


>   It does show a general attitude of clampdown, but 
>   I think we were all already well aware of that 
>   pervasive attitude in virtually all governmental bodies these days.

so we ignore, and then shrug off, evil because it has become "pervasive"?

"first they came for ___, but i did nothing because i wasn't ___..."

some people can't see the forest for the trees.
other people can't see the forest
because they won't look past the end of their nose...

-bowerbird

p.s.  and in the future, nobody will be able to see the trees
because the corporations chopped 'em all down while we were
busy nailing our eyes shut so we wouldn't see anything "pervasive"...
From the43rdearlofcranberry at yahoo.co.uk  Wed Apr 20 11:40:16 2005
From: the43rdearlofcranberry at yahoo.co.uk (Tom Day)
Date: Wed Apr 20 11:40:25 2005
Subject: [gutvol-d] New Copyright law
In-Reply-To: 6667
Message-ID: <20050420184016.12311.qmail@web26609.mail.ukl.yahoo.com>

I'm new to all this...is there some well-established
reason why you seem to hate everyone on this list so
much, and yet continue to subscribe to it? If I'm
missing something obvious (I mean, I know there are
people who are just born that way) then I apologise.

--- Bowerbird@aol.com wrote:
> dave said:
> >   It has not a jot of impact on PG
> 
> right.
> 
> 
> >   It does show a general attitude of clampdown,
> but 
> >   I think we were all already well aware of that 
> >   pervasive attitude in virtually all governmental
> bodies these days.
> 
> so we ignore, and then shrug off, evil because it
> has become "pervasive"?
> 
> "first they came for ___, but i did nothing because
> i wasn't ___..."
> 
> some people can't see the forest for the trees.
> other people can't see the forest
> because they won't look past the end of their
> nose...
> 
> -bowerbird
> 
> p.s.  and in the future, nobody will be able to see
> the trees
> because the corporations chopped 'em all down while
> we were
> busy nailing our eyes shut so we wouldn't see
> anything "pervasive"...

From ian at babcockbrown.com  Wed Apr 20 11:07:17 2005
From: ian at babcockbrown.com (Ian Stoba)
Date: Wed Apr 20 12:17:50 2005
Subject: [gutvol-d] New Copyright law
In-Reply-To: <BAY2-F23C265459D1317C2F1C422DF2B0@phx.gbl>
References: <BAY2-F23C265459D1317C2F1C422DF2B0@phx.gbl>
Message-ID: <68d8d0b9d2f3a9f10002ae73dc7295f7@babcockbrown.com>


On Apr 20, 2005, at 8:34 AM, Dave Doty wrote:

>> From: "N Wolcott" <nwolcott@dsdial.net>
>
>> What is the impact of the new copyright passed by the house yesterday 
>> on the likes of PG, Ockerbloom, and other sites, not to mention the 
>> minions. ???
>
> Since you weren't more specific, I had to hit the web to try to figure 
> out which law you were talking about.  The only thing I could find 
> that seemed to fit the bill was this:
>
> http://news.zdnet.com/2100-9588_22-5677232.html
>

The bill has some other interesting provisions, notably allowing the 
use of third party technology to skip offensive material when a film is 
shown in a home:

http://www.wired.com/news/politics/0,1283,67269,00.html

The use of devices like ClearPlay has been opposed by filmmakers who do 
not want their movies getting bleeped or pixellated. I think the most 
interesting question is whether a device like ClearPlay is in fact a 
circumvention device for DMCA purposes. If so, I think this is the 
first time the DMCA anti-circumvention provisions have been weakened by 
Congress.

I'm not quite sure what to make of this just yet, particularly since it 
raises the penalty substantially for sharing materials that were never 
commercially released. It does make me slightly optimistic to see 
Congress realizing that the DMCA and other extensions of copyright do 
limit how people can use the works they purchase in the way they want.

  
From marcello at perathoner.de  Wed Apr 20 12:41:02 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Apr 20 12:41:14 2005
Subject: [gutvol-d] New Copyright law
In-Reply-To: <20050420184016.12311.qmail@web26609.mail.ukl.yahoo.com>
References: <20050420184016.12311.qmail@web26609.mail.ukl.yahoo.com>
Message-ID: <4266B04E.1010204@perathoner.de>

Tom Day wrote:

> I'm new to all this...is there some well-established reason why you
> seem to hate everyone on this list so much, and yet continue to
> subscribe to it? If I'm missing something obvious (I mean, I know
> there are people who are just born that way) then I apologise.

See "The Showcase of Pudd'nhead Bowerbird" at:

   http://www.gnutenberg.de/bowerbird/


-- 
Marcello Perathoner
webmaster@gutenberg.org

From brandon at corruptedtruth.com  Wed Apr 20 12:44:43 2005
From: brandon at corruptedtruth.com (Brandon Galbraith)
Date: Wed Apr 20 12:44:55 2005
Subject: [gutvol-d] New Copyright law
In-Reply-To: <4266B04E.1010204@perathoner.de>
References: <20050420184016.12311.qmail@web26609.mail.ukl.yahoo.com>
	<4266B04E.1010204@perathoner.de>
Message-ID: <4266B12B.4080007@corruptedtruth.com>

Marcello,

That made my day. Thank you =)

-brandon

Marcello Perathoner wrote:

> Tom Day wrote:
>
>> I'm new to all this...is there some well-established reason why you
>> seem to hate everyone on this list so much, and yet continue to
>> subscribe to it? If I'm missing something obvious (I mean, I know
>> there are people who are just born that way) then I apologise.
>
>
> See "The Showcase of Pudd'nhead Bowerbird" at:
>
>   http://www.gnutenberg.de/bowerbird/
>
>
>
>


From Bowerbird at aol.com  Wed Apr 20 12:59:28 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 12:59:41 2005
Subject: [gutvol-d] Questions on PG and other ebooks
Message-ID: <d2.2720929f.2f980ea0@aol.com>

melissa said:
>   If we are going to post incomplete books, the notice should not be
>   just for 'our own people' but should be very clearly stated to the
>   users of PG. 

i agree.  it should be noted right there in the e-text itself.
which, in the sole example given, was precisely the case...


>   Anything less is deceit, in my mind. 

well, "deceit" is a pretty strong word, with ugly implications.
let's save that for where it applies, like with our government.


>   Such books should go in a section for incomplete projects,
>   and the end-user's help should be specifically solicited, 
>   creating a partnership with him

i agree.  and have said as much in the past, several times,
not just for the specific case of incomplete projects, but for
the larger arena of which it is a subset -- error-correction...

i even offered to set up a user-based error-correction system,
which is _desperately_ needed by this ever-growing e-library.

my offer was rebuffed.

but melissa, since you're on this kick today, how about if _you_
offer to set up such a system.  maybe they'll accept your offer.
one task you'd do is to scour the library for incomplete e-texts.


>   rather than putting him off by trying to 
>   pass off a 'broken' project as one in good condition.

i don't believe anyone was ever suggesting that we should try to
"pass off" a "broken" e-text as if it were "one in good condition".

i believe michael was suggesting exactly what you've suggested,
namely that we involve the end-users to help solve the problem.
after all, all of "our own people" are end-user volunteers, right?

the "missing pages wiki" over at distributed proofreaders is a start,
but since that's only known to people at d.p. (and only some of them),
it is no more than a start.  make it bigger, melissa.  take it public...

-bowerbird
From marcello at perathoner.de  Wed Apr 20 13:10:47 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Apr 20 13:11:07 2005
Subject: [gutvol-d] please test pg website cache
Message-ID: <4266B747.50409@perathoner.de>

We are planning to speed up the pg website by deploying a set of squid 
cache servers.

The first experimental squid cache is now online at:

   http://de.cache.gutenberg.org


This site should look and behave exactly the same as the original 
www.gutenberg.org site.

This site caches all static pages from www.gutenberg.org like the 
/browse/* and /etext/* pages. It does not cache dynamic generated pages 
like your search results. It does not cache the etext files.

If you are located in Western Europe (especially Germany) you should 
notice some speed improvement over www.gutenberg.org. (This of course 
will be marginal unless a considerable number of people start using it.)

This url is for testing only and will be removed once testing is 
complete, so don't publish this url on other web sites.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Wed Apr 20 13:14:04 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 13:14:16 2005
Subject: [gutvol-d] New Copyright law
Message-ID: <13d.11b07254.2f98120c@aol.com>

tom said:
>   I'm new to all this...is there some well-established reason 
>   why you seem to hate everyone on this list so much, 
>   and yet continue to subscribe to it?

it's simply not true that i "hate everyone on this list so much".

to begin with, i know very few of the people here _personally_,
and i wouldn't hold an emotion as strong as "hate" for anyone
that i didn't know personally.  in fact, i rarely "hate" anyone,
even if i _do_ know them personally.  life is too short for that.

i'm certainly not gonna be putting up webpages about anyone.

indeed, i have a fond feeling for most of the people here,
just as i do for project gutenberg as an abstract entity,
because the people here are one with me in the cause of
electronic-books, which is why i subscribe to this listserve.
i've been active in e-books for 25 years.  why wouldn't i be here?

i _do_ have some impatience with an inability or unwillingness
to consider things in their broader context, a shortcoming that
i see here on a not-infrequent basis, so i mount an attack on that.

for instance, i still ain't seen one word on this list about the recent
amazon purchases, and how they might affect the world of e-books.

so i think you people here need a wake-up call.  so i give it to you.

you shouldn't take my posts personally though.

unless you're one of the people putting up webpages about me.     :+)

as for the posts _today_, i'm just celebrating 4/20 with a little fun,
and reminding dave (and you, and everyone else) that whether or not
i choose to post here is up to me...  at least until greg bans me again...

i'm also writing backchannels today, and posts to another listserve,
about the issue of error-correction, so i thought i would post here too.
i want to make it clear that i have _tried_ to communicate with y'all,
even though there are many people here trying to get me to shut up...
i have guts to say it to your face, even if you don't have guts to hear it.

so, tom, now, do you want to talk about issues?  or about me?

and if you _do_ want to talk about me, one measly post ain't much.
after all, i've got people putting up webpages about me...        :+)


>   (I mean, I know there are people who are just born that way)

i see...

-bowerbird
From jon_niehof at yahoo.com  Wed Apr 20 13:54:25 2005
From: jon_niehof at yahoo.com (Jon Niehof)
Date: Wed Apr 20 13:54:35 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: 6667
Message-ID: <20050420205426.73210.qmail@web41624.mail.yahoo.com>

--- Bowerbird@aol.com wrote:
> melissa said:
> > If we are going to post incomplete books, the notice
> > should not be just for 'our own people' but should be very
> > clearly stated to the users of PG. 
> 
> i agree.  it should be noted right there in the e-text itself.
> which, in the sole example given, was precisely the case...

But isn't that a form of markup? I thought markup was bad?

From ke at gnu.franken.de  Wed Apr 20 13:52:09 2005
From: ke at gnu.franken.de (Karl Eichwalder)
Date: Wed Apr 20 15:38:35 2005
Subject: [gutvol-d] Re: Questions on PG and other ebooks
In-Reply-To: <16997.25608.758628.394332@celery.zuhause.org> (Bruce Albrecht's
	message of "Tue, 19 Apr 2005 15:03:20 -0500")
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<16997.25608.758628.394332@celery.zuhause.org>
Message-ID: <shd5spkw7a.fsf@tux.gnu.franken.de>

Bruce Albrecht <bruce@zuhause.org> writes:

> Personally, I would not buy books from a publisher with a reputation
> for knowingly publishing books with pages missing, nor would I want to
> download from PG if it had a reputation for knowingly publishing
> etexts that are missing pages.

I bought incomplete books - they were cheap and I was mostly interested
in the photographs.  As long as incomplete books are properly described
and listed, it would be useful to offer them for download.

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C
From distributedmel at gmail.com  Wed Apr 20 15:58:55 2005
From: distributedmel at gmail.com (Melissa Er-Raqabi)
Date: Wed Apr 20 15:59:09 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <d2.2720929f.2f980ea0@aol.com>
References: <d2.2720929f.2f980ea0@aol.com>
Message-ID: <a5a2a502050420155865381c73@mail.gmail.com>

I'm pretty busy right now, bowerbird, with post-processing complete
texts for upload to PG--90 or so in the last 6-8 months.

Melissa

> 
> but melissa, since you're on this kick today, how about if _you_
> offer to set up such a system.  maybe they'll accept your offer.
> one task you'd do is to scour the library for incomplete e-texts.
>
From Bowerbird at aol.com  Wed Apr 20 16:09:47 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 16:10:09 2005
Subject: [gutvol-d] Questions on PG and other ebooks
Message-ID: <bf.55e29458.2f983b3b@aol.com>

melissa said:
>   I'm pretty busy right now, bowerbird, with 
>   post-processing complete texts for upload to PG--
>   90 or so in the last 6-8 months.

i understand.        :+)

thanks for taking time out of your busy schedule
to come and give your recommendations on what
project gutenberg should be doing.  i appreciate
hearing everyone's input twice as much as i like
giving my own, because i have two ears and only
one mouth.

that's another reason i post every once in a while,
because otherwise, y'all lapse into long periods of
silence.  but when _i_ post, there's a _reaction_...

and thanks for post-processing all those e-texts.
it's not a pretty job, but somebody's gotta do it...

-bowerbird
From Bowerbird at aol.com  Wed Apr 20 16:17:06 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 16:17:25 2005
Subject: [gutvol-d] speaking of missing pages
Message-ID: <f4.4f46c53f.2f983cf2@aol.com>

speaking of "missing pages", i got this today...

-bowerbird

-------------------------------------------------------------------


Fresh from London's "Independent"

Some possibly very exciting news:


Decoded at last: the 'classical holy grail' that may 
rewrite the history of the world


Scientists begin to unlock the secrets of papyrus scraps bearing 
long-lost words by the literary giants of Greece and Rome

By David Keys and Nicholas Pyke


17 April 2005


For more than a century, it has caused excitement and frustration in equal
measure - a collection of Greek and Roman writings so vast it could redraw
the map of classical civilisation. If only it was legible.


Now, in a breakthrough described as the classical equivalent of finding the
holy grail, Oxford University scientists have employed infra-red technology
to open up the hoard, known as the Oxyrhynchus Papyri, and with it the
prospect that hundreds of lost Greek comedies, tragedies and epic poems will
soon be revealed.


In the past four days alone, Oxford's classicists have used it to make a
series of astonishing discoveries, including writing by Sophocles,
Euripides, Hesiod and other literary giants of the ancient world, lost for
millennia. They even believe they are likely to find lost Christian gospels,
the originals of which were written around the time of the earliest books of
the New Testament.


The original papyrus documents, discovered in an ancient rubbish dump in
central Egypt, are often meaningless to the naked eye - decayed, worm-eaten
and blackened by the passage of time. But scientists using the new
photographic technique, developed from satellite imaging, are bringing the
original writing back into view. Academics have hailed it as a development
which could lead to a 20 per cent increase in the number of great Greek and
Roman works in existence. Some are even predicting a "second Renaissance".


Christopher Pelling, Regius Professor of Greek at the University of Oxford,
described the new works as "central texts which scholars have been
speculating about for centuries".


Professor Richard Janko, a leading British scholar, formerly of University
College London, now head of classics at the University of Michigan, said:
"Normally we are lucky to get one such find per decade." One discovery in
particular, a 30-line passage from the poet Archilochos, of whom only 500
lines survive in total, is described as "invaluable" by Dr Peter Jones,
author and co-founder of the Friends of Classics campaign.


The papyrus fragments were discovered in historic dumps outside the
Graeco-Egyptian town of Oxyrhynchus ("city of the sharp-nosed fish") in
central Egypt at the end of the 19th century. Running to 400,000 fragments,
stored in 800 boxes at Oxford's Sackler Library, it is the biggest hoard of
classical manuscripts in the world.


The previously unknown texts, read for the first time last week, include
parts of a long-lost tragedy - the Epigonoi ("Progeny") by the 5th-century
BC Greek playwright Sophocles; part of a lost novel by the 2nd-century Greek
writer Lucian; unknown material by Euripides; mythological poetry by the
1st-century BC Greek poet Parthenios; work by the 7th-century BC poet
Hesiod; and an epic poem by Archilochos, a 7th-century successor of Homer,
describing events leading up to the Trojan War. Additional material from
Hesiod, Euripides and Sophocles almost certainly awaits discovery.


Oxford academics have been working alongside infra-red specialists from
Brigham Young University, Utah. Their operation is likely to increase the
number of great literary works fully or partially surviving from the ancient
Greek world by up to a fifth. It could easily double the surviving body of
lesser work - the pulp fiction and sitcoms of the day.


"The Oxyrhynchus collection is of unparalleled importance - especially now
that it can be read fully and relatively quickly," said the Oxford academic
directing the research, Dr Dirk Obbink. "The material will shed light on
virtually every aspect of life in Hellenistic and Roman Egypt, and, by
extension, in the classical world as a whole."


The breakthrough has also caught the imagination of cultural commentators.
Melvyn Bragg, author and presenter, said: "It's the most fantastic news.
There are two things here. The first is how enormously influential the
Greeks were in science and the arts. The second is how little of their
writing we have. The prospect of having more to look at is wonderful."


Bettany Hughes, historian and broadcaster, who has presented TV series
including Mysteries of the Ancients and The Spartans, said: "Egyptian
rubbish dumps were gold mines. The classical corpus is like a jigsaw puzzle
picked up at a jumble sale - many more pieces missing than are there.
Scholars have always mourned the loss of works of genius - plays by
Sophocles, Sappho's other poems, epics. These discoveries promise to change
the textual map of the golden ages of Greece and Rome."


When it has all been read - mainly in Greek, but sometimes in Latin, Hebrew,
Coptic, Syriac, Aramaic, Arabic, Nubian and early Persian - the new material
will probably add up to around five million words. Texts deciphered over the
past few days will be published next month by the London-based Egypt
Exploration Society, which financed the discovery and owns the collection.


A 21st-century technique reveals antiquity's secrets


Since it was unearthed more than a century ago, the hoard of documents known
as the Oxyrhynchus Papyri has fascinated classical scholars. There are
400,000 fragments, many containing text from the great writers of antiquity.
But only a small proportion have been read so far. Many were illegible.


Now scientists are using multi-spectral imaging techniques developed from
satellite technology to read the papyri at Oxford University's Sackler
Library. The fragments, preserved between sheets of glass, respond to the
infra-red spectrum - ink invisible to the naked eye can be seen and
photographed.


The fragments form part of a giant "jigsaw puzzle" to be reassembled.
Missing "pieces" can be supplied from quotations by later authors, and
grammatical analysis.


Key words from the master of Greek tragedy


Speaker A: . . . gobbling the whole, sharpening the flashing iron.


Speaker B: And the helmets are shaking their purple-dyed crests, and for the
wearers of breast-plates the weavers are striking up the wise shuttle's
songs, that wakes up those who are asleep.


Speaker A: And he is gluing together the chariot's rail.


These words were written by the Greek dramatist Sophocles, and are the only
known fragment we have of his lost play Epigonoi (literally "The Progeny"),
the story of the siege of Thebes. Until last week's hi-tech analysis of
ancient scripts at Oxford University, no one knew of their existence, and
this is the first time they have been published.


Sophocles (495-405 BC) was a giant of the golden age of Greek civilisation,
a dramatist who worked alongside and competed with Aeschylus, Euripides and
Aristophanes.


His best-known work is Oedipus Rex, the play that later gave its name to the
Freudian theory, in which the hero kills his father and marries his mother -
in a doomed attempt to escape the curse he brings upon himself. His other
masterpieces include Antigone and Electra.


Sophocles was the cultured son of a wealthy Greek merchant, living at the
height of the Greek empire. An accomplished actor, he performed in many of
his own plays. He also served as a priest and sat on the committee that
administered Athens. A great dramatic innovator, he wrote more than 120
plays, but only seven survive in full.


Last week's remarkable finds also include work by Euripides, Hesiod and
Lucian, plus a large and particularly significant paragraph of text from the
Elegies, by Archilochos, a Greek poet of the 7th century BC.
From Bowerbird at aol.com  Wed Apr 20 16:42:57 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Apr 20 16:43:15 2005
Subject: [gutvol-d] Questions on PG and other ebooks
Message-ID: <2b.7193db38.2f984301@aol.com>

jon said:
>   But isn't that a form of markup? 

i don't know.

periods are markup, some people say.
and blank lines between paragraphs.
even the space between two words.

so i don't know.  define it however you want.


>   I thought markup was bad?

who told you that?

markup, depending on how you define it, is 
whatever it is.  it's not bad or good.  it just is.

having to _apply_ markup -- especially heavy markup --
takes a lot of time and energy, so that can be _costly_.

but if it delivers _benefits_, though, especially benefits
that cannot be obtained another way, then it _might_ be 
cost-efficient.  you have to put it on a scale and weigh it.

costs are bad.  benefits are good.  so you find the balance...

if i could wave a magic wand, and have the entire library
marked up, in x.m.l. or some other form of markup, i would
love it.  why not?

or, alternately, if someone else is willing to do all that
heavy markup for me, while i sit back and drink beer,
i would love that too.  why not?

but if y'all are gonna sit around, for year after year after
year after year, all the while intending to do x.m.l. markup,
but never actually getting any done, what's the point?

myself, i like my markup to be "invisible" -- to be _zen_.
like spaces between words, blank lines between paragraphs.
to _facilitate_ my understanding of the intent of the author.

(illustrated by the underscores i just used to indicate italics,
so you'd know that my intention was to emphasize that word.
of course, it's nice if your viewer-program would _convert_
an underscored word to an italicized one when displaying it,
as some viewer-programs will -- like ubook, for instance --
but in the absence of that, i'll trust your imagination to do it.)

all of which has absolutely nothing to do with the case here.

if there are two pages missing inside an e-text, _say_so_.
say it right where they're missing, at the top of the e-text,
on the webpage that lists all the e-texts with missing pages,
in the newsletter, and anywhere else where you believe that
it might come to the attention of someone who can and will
_provide_ those missing pages.  all of this is in keeping with
a systemic effort to incorporate end-users as _co-creators_
and _co-owners_ of their planetary digital library.  you dig?

because the quickest way to get 20 million books digitized is
for 20 million people to do one book each, then check one other.
now _that_ would be distributing the workload!

if anyone else wants to converse with me, today is your day.
it won't be 4/20 tomorrow, so speak up now if you want...

-bowerbird
From gbnewby at pglaf.org  Wed Apr 20 22:10:24 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Wed Apr 20 22:10:26 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <6d99d1fd05041914426b0436ac@mail.gmail.com>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
Message-ID: <20050421051024.GE6638@pglaf.org>

On Tue, Apr 19, 2005 at 04:42:56PM -0500, David Starner wrote:
> On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
> > Any reason not to post them with a comment that these pages are missing?
> > 
> > Readers would thus be encouraged to help find the missing pages.
> 
> If you look at book 13921, you'll notice that it's missing pages 98
> and 99 ("[Seiten 98 und 99 fehlen!]", embedded in the middle of the
> text). How many readers have jumped forward to offer the missing
> pages? If it had been kept at DP, we could have found the pages and
> added them. But once it's on the shelf, nobody worries about it
> anymore.
> 
> I've found as a general rule, once a book is posted, the odds of
> anything getting done on it drop vastly. It gets moved to the
> completed pile, and new books take its place.

I feel like I might be stepping on a hornet's nest, so please
try to be gentle with me:

My questions are two:

1. What is the approximate success rate & timetable for getting missing
pages for books in DP?  (I.e., how many books are stalled for missing
pages, and how many have had their pages found/restored, and how long
after proofreading was complete did this happen?)

2. I'm aware there are a sizeable number of books at DP that have
completed proofreading, yet are not yet uploaded to the PG servers.
What proportion is awaiting missing pages, versus other types of delays?


I want to offer two things, also:

a) We can run requests in the newsletters for particular items.
These go out to > 6K subscribers, and we might get some positive
responses.  I think Branko Collins was looking to provide
some regular DP content to Michael Hart for the newsletter -
or, just email stuff to Michael or me.

b) ditto for the gutenberg.org Web page: a "wanted" area (with
lots of changing content -- drawn from a list of titles missing
pages) would probably get a lotta clicks.

  -- Greg
From gbnewby at pglaf.org  Wed Apr 20 22:17:44 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Wed Apr 20 22:17:45 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <15cfa2a5050419141222f2a8ad@mail.gmail.com>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6.2.0.8.0.20050419104217.01edb008@mail.scripps.edu>
	<003701c54522$b6cfbf40$f3c2883e@freeserve.co.uk>
	<15cfa2a5050419141222f2a8ad@mail.gmail.com>
Message-ID: <20050421051744.GF6638@pglaf.org>

On Tue, Apr 19, 2005 at 05:12:43PM -0400, Robert Cicconetti wrote:
> On 4/19/05, Phil Hitchcock <phil@hitchcock99.freeserve.co.uk> wrote:
> > PG already issues books with missing pages, e.g. #11866. However it is
> > stated at the beginning that certain specified pages are missing, so the
> > reader knows what to expect. If the currently best available copy of a text,
> > which may be several hundred years old, is missing a few pages, well that is
> > unfortunate; but surely it is better to give people the chance to read the
> > 99% that is available. Our great museums do not say, this pot has a few
> > chips in it so we will not exhibit it.
> 
> The main reason to avoid incomplete projects at DP is a lack of
> resources, both of skilled people and technical resources. In fact
> they go together; the Post Processing backlog at DP is causing a
> chronic shortage of disk space. If a project has to sit on the server
...

I'm posting here, in case discussion has stalled or this message didn't
get to the right person previously: We're perpetually ready to acquire
additional hardware for DP.

I can also offer lots of off-site networked storage for backups,
"holding" items, etc., etc.  There have been numerous short discussions
about this, but it sounds like most DP folks are busy doing other
things, and haven't had cycles to work on expanding infrastructure.  So,
in case this helps, I want to reiterate that funding for DP's
hardware/network/backups/storage infrastructure is available.

> for 6 additional months waiting for 2 pages, that is not good. Also,
> by posting an incomplete work, you add to already heavy PP work load.

Just a quick note that for posted eBooks such errata/additions can go to
the errata list (errata AT pglaf.org).  They don't need to go back to
the PPer (though in some cases they might need to).  The errata team
is also overworked, of course...

If we do a lot of this, and it involves starting with OCR &
proofreading, then I agree it's non-trivial no matter who gets the page
scans.  But if we can get the scan/page donor to supply proofread
text, it's much easier.
  -- Greg

From grythumn at gmail.com  Wed Apr 20 23:34:02 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Wed Apr 20 23:34:20 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <20050421051024.GE6638@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org>
Message-ID: <15cfa2a505042023341fbeb5f7@mail.gmail.com>

On 4/21/05, Greg Newby <gbnewby@pglaf.org> wrote:
> On Tue, Apr 19, 2005 at 04:42:56PM -0500, David Starner wrote:
> > On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
> I feel like I might be stepping on a hornet's nest, so please
> try to be gentle with me:
> 
> My questions are two:
> 
> 1. What is the approximate success rate & timetable for getting missing
> pages for books in DP?  (I.e., how many books are stalled for missing
> pages, and how many have had their pages found/restored, and how long
> after proofreading was complete did this happen?)

Nobody really keeps track. There are a hundred lines of changelog in
the forums on the missing pages wiki that could probably provide some
information, but not every book goes through there. The second post in
that thread is a list of volunteers and the libraries that they have
access to; you can PM those people directly, or request the book
yourself through ILL, or find a copy on ebay, or... Worldcat and the
library list are a very effective combination.
 
> 2. I'm aware there are a sizeable number of books at DP that have
> completed proofreading, yet are not yet uploaded to the PG servers.
> What proportion is awaiting missing pages, versus other types of delays?

There are ~25 books listed on the missing pages wiki; several have
been claimed by someone. There are 636 books waiting for a post
processor to claim them, 1600 claimed for post processing, 126 waiting
for verification, and 175 being verified. Note that current policy is
that incomplete books should not be uploaded to DP.

R C
From traverso at dm.unipi.it  Thu Apr 21 00:04:38 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Thu Apr 21 00:02:08 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <15cfa2a505042023341fbeb5f7@mail.gmail.com> (message from Robert
	Cicconetti on Thu, 21 Apr 2005 02:34:02 -0400)
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org>
	<15cfa2a505042023341fbeb5f7@mail.gmail.com>
Message-ID: <200504210704.j3L74ch11399@pico.dm.unipi.it>


It is much better if the incomplete projects remain at DP: when pages
are found, DP updates both the text and the images, so that, when
these will be made available, these will be complete too.

However a page at PG with a list of current requests might be very
useful. A page with a line for each book, with a link to a description of
the problem. When fixed, the line will be moved to a different
position for a while, with thanks.

Of course the page could be useful for non-DP too; and may contain
requests for books in PG that need maintenance. 


Carlo
From marcello at perathoner.de  Thu Apr 21 03:54:00 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Apr 21 03:54:25 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <bf.55e29458.2f983b3b@aol.com>
References: <bf.55e29458.2f983b3b@aol.com>
Message-ID: <42678648.8050906@perathoner.de>

Bowerbird@aol.com wrote:

> that's another reason i post every once in a while,
> because otherwise, y'all lapse into long periods of
> silence.  but when _i_ post, there's a _reaction_...

Face it, provoking a reaction isn't the same thing as saying something 
significant.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From M.J.Farmer at bham.ac.uk  Thu Apr 21 02:44:18 2005
From: M.J.Farmer at bham.ac.uk (Malcolm Farmer)
Date: Thu Apr 21 04:30:35 2005
Subject: [gutvol-d] speaking of missing pages
In-Reply-To: <f4.4f46c53f.2f983cf2@aol.com>
References: <f4.4f46c53f.2f983cf2@aol.com>
Message-ID: <426775F2.2@bham.ac.uk>

Bowerbird@aol.com wrote:

>speaking of "missing pages", i got this today...
>
>-bowerbird
>
>-------------------------------------------------------------------
>
>
>Fresh from London's "Independent"
>
>Some possibly very exciting news:
>
>
>Decoded at last: the 'classical holy grail' that may 
>rewrite the history of the world
>
>
>Scientists begin to unlock the secrets of papyrus scraps bearing 
>long-lost words by the literary giants of Greece and Rome
>  
>
I saw a feature on UK TV a year or more back about the half-incinerated 
library found at Herculaneum. Those scrolls basically looked like 
charcoal briquettes, but could be made flexible by wetting with 
solvents, and prised carefully apart to put under a camera.  They showed 
someone with a multispectral camera stepping through wavelengths till he 
arrived at one where 'charred papyrus' had a different reflectivity to 
'charred papyrus+ink'. There was such an expression of delight on the 
researcher's face  as he saw the text coming up on screen and described 
what he was reading....

So this technique has been around a while, long enough to be shown in 
use on television.  I'm surprised it's taken so long to get attention 
from the press. Pretty cool technique, though, and worthy of a writeup.
From joshua at hutchinson.net  Thu Apr 21 05:18:51 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Apr 21 05:20:06 2005
Subject: [gutvol-d] Questions on PG and other ebooks
Message-ID: <20050421121851.26D2F4F521@ws6-5.us4.outblaze.com>


----- Original Message -----
From: "Robert Cicconetti" <grythumn@gmail.com>
> 
> On 4/21/05, Greg Newby <gbnewby@pglaf.org> wrote:
> > On Tue, Apr 19, 2005 at 04:42:56PM -0500, David Starner wrote:
> > > On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
> > I feel like I might be stepping on a hornet's nest, so please
> > try to be gentle with me:
> >
> > My questions are two:
> >
> > 1. What is the approximate success rate & timetable for getting missing
> > pages for books in DP?  (I.e., how many books are stalled for missing
> > pages, and how many have had their pages found/restored, and how long
> > after proofreading was complete did this happen?)
> 
> Nobody really keeps track. 

This is true.  But, let me offer this completely personal, anecdotal evidence.

I've added 691 projects to the DP queue in the last couple years.  Granted, many of them are still in the queue waiting to be released (maybe 25% or more).  Out of those nearly 700 projects, 2 have been held up by missing pages.  The first one was about a year and a half ago and I can't remember anymore how that one was resolved.  The other just cropped up last week.  The images were taken from Cornell's Making of America project and they had one page scanned in twice, overwriting another page.  In response to an e-mail, their administrators promised that they would fix it soon.  I'm giving them until next week before I bug them again.

If my numbers are any indication ... around 0.29% of our projects will come up missing a page.  I have a feeling it is actually a point or two higher, but I could be wrong.
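
For anyone who wants to check that figure, here is a minimal Python sketch of the arithmetic. The function name is only illustrative, and the inputs are just the two counts given above (2 stalled projects out of 691); it says nothing about DP as a whole.

```python
def missing_page_rate(stalled: int, total: int) -> float:
    """Return the share of projects stalled by missing pages, in percent."""
    return 100.0 * stalled / total


# Counts taken from the message above: 2 stalled projects out of 691.
rate = missing_page_rate(2, 691)
print(f"{rate:.2f}%")  # prints 0.29%
```

With a sample this small, one more stalled project would move the estimate to about 0.43%, so the number is best read as a rough order of magnitude.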

Josh
From vze3rknp at verizon.net  Thu Apr 21 06:14:07 2005
From: vze3rknp at verizon.net (Juliet Sutherland)
Date: Thu Apr 21 06:13:56 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <20050421121851.26D2F4F521@ws6-5.us4.outblaze.com>
References: <20050421121851.26D2F4F521@ws6-5.us4.outblaze.com>
Message-ID: <4267A71F.5070603@verizon.net>


Joshua Hutchinson wrote:

>----- Original Message -----
>From: "Robert Cicconetti" <grythumn@gmail.com>
>
>>On 4/21/05, Greg Newby <gbnewby@pglaf.org> wrote:
>>>On Tue, Apr 19, 2005 at 04:42:56PM -0500, David Starner wrote:
>>>>On 4/19/05, Michael Hart <hart@pglaf.org> wrote:
>>>I feel like I might be stepping on a hornet's nest, so please
>>>try to be gentle with me:
>>>
>>>My questions are two:
>>>
>>>1. What is the approximate success rate & timetable for getting missing
>>>pages for books in DP?  (I.e., how many books are stalled for missing
>>>pages, and how many have had their pages found/restored, and how long
>>>after proofreading was complete did this happen?)
>>>
>>Nobody really keeps track. 
>
>This is true.  But, let me offer this completely personal, anecdotal evidence.
>
>I've added 691 projects to the DP queue in the last couple years.  Granted, many of them are still in the queue waiting to be released (maybe 25% or more).  Out of those nearly 700 projects, 2 have been held up by missing pages.  The first one was about a year and a half ago and I can't remember anymore how that one was resolved.  The other just cropped up last week.  The images were taken from Cornell's Making of America project and they had one page scanned in twice, overwriting another page.  In response to an e-mail, their administrators promised that they would fix it soon.  I'm giving them until next week before I bug them again.
>
>If my numbers are any indication ... around 0.29% of our projects will come up missing a page.  I have a feeling it is actually a point or two higher, but I could be wrong.
>
>Josh
>
>  
>
I have put nearly 1200 books through DP of which 876 have been posted to 
PG. The majority of those are ones that I have scanned, so I have 
control over dealing with things like bad scans. I have a handful of 
projects (~5) that have problems with bad or missing pages. 3 of them are 
from the Million Books Project and were processed before we knew we had 
to be very careful about checking those scans. The others are from 
periodicals. I also have 3 books that I've scanned but not put on DP due 
to missing pages (I'm quite sure I'll run across another copy of those 
books reasonably soon). I know of another 4-5 cases where material was 
cut off on the scan (or on the actual page) and was obtained by other 
volunteers so that the project could be finished. I've had a couple of  
items where the material is so obscure that I've told the PPer to just 
mark the missing parts (usually just a piece of a page) and post it. 
Bulletin de Lille (a French twice-weekly newspaper published by the 
German occupiers of Lille during WWI) is the example that comes to mind.

My impression is that missing pages/text are not a big problem in 
percentage terms at DP. But they do happen often enough that we have a 
procedure for dealing with them. We have some volunteers who have been 
remarkably responsive and helpful in finding missing pages or checking 
obscured text and who deserve unending thanks. I'm quite certain that 
these problems are more likely to be successfully resolved within the DP 
system, where we have methods for tracking them, than they would be if 
posted to PG where there is not yet a systematic way to find and work on 
them.

JulietS
DP Site Admin

From jon_niehof at yahoo.com  Thu Apr 21 07:51:17 2005
From: jon_niehof at yahoo.com (Jon Niehof)
Date: Thu Apr 21 07:51:23 2005
Subject: [gutvol-d] speaking of missing pages
In-Reply-To: 6667
Message-ID: <20050421145118.6074.qmail@web41623.mail.yahoo.com>

--- Malcolm Farmer <M.J.Farmer@bham.ac.uk> wrote:
> So this technique has been around a while, long enough to be
> shown in use on television.

Indeed; have a look at Ars' take:
http://arstechnica.com/news.ars/post/20050420-4827.html


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
From shimmin at uiuc.edu  Thu Apr 21 08:36:46 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Thu Apr 21 08:36:53 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <20050421051024.GE6638@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>	<4262A37F.5020109@hutchinson.net>	<6d99d1fd05041721326f39efb7@mail.gmail.com>	<200504181821.51488.donovan@abs.net>	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org>
Message-ID: <4267C88E.8020403@uiuc.edu>


> 1. What is the approximate success rate & timetable for getting missing
> pages for books in DP?  (I.e., how many books are stalled for missing
> pages, and how many have had their pages found/restored, and how long
> after proofreading was complete did this happen?)

My experience as a DP project manager is that if I simply put a page 
request up for grabs, I might get a bite, and I might not; I feel the 
half-life of these "passive" requests might be measured in months.

If I am more proactive, and look through library catalogs to identify a 
library that claims to have the book I need a page from, and then ask a 
fellow DP-er who might have access to that library, I get better 
results.  My experience is that if I ask two or three people, at least 
one of them will be willing and able to make the scan at their 
convenience, and they usually find it convenient to do so within a week 
or three.

So, if a person is willing to put the legwork into locating the book, 
they will probably get results.  Not quickly, but not glacially slowly, 
either.

> 2. I'm aware there are a sizeable number of books at DP that have
> completed proofreading, yet are not yet uploaded to the PG servers.
> What proportion is awaiting missing pages, versus other types of delays.

Who knows?  Other issues that hold books up are a small fraction of 
illegible text, needing to locate a classicist or speaker of a foreign 
language, and the project being Just Plain Hard (like doing html for a 
project with over 300 images).

-- RS
From jhowse at nf.sympatico.ca  Thu Apr 21 13:33:17 2005
From: jhowse at nf.sympatico.ca (JHowse)
Date: Thu Apr 21 09:02:51 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <4267C88E.8020403@uiuc.edu>
References: <20050421051024.GE6638@pglaf.org>
	<4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org>
Message-ID: <5.1.0.14.0.20050421131953.00a5cec0@pop1.nf.sympatico.ca>


>>2. I'm aware there are a sizeable number of books at DP that have
>>completed proofreading, yet are not yet uploaded to the PG servers.
>>What proportion is awaiting missing pages, versus other types of delays.
>
>Who knows?  Other issues that hold books up are a small fraction of 
>illegible text, needing to locate a classicist or speaker of a foreign 
>language, and the project being Just Plain Hard (like doing html for a 
>project with over 300 images).

Plus the fact that we are all volunteers on PG, and most of those prefer to 
do the proofreading, not the Post Processing. I am going as fast as I can 
and I'm sure that goes for the other PPers. :D

JHowse


================================================================================
"I'm not likely to write a great novel or compose a song or save a baby
from a burning building...but I can help make sure that there is an
electronic library of free knowledge available for future people to
access."--jhutch.
Preserving History One Page at a Time!!
Celebrating our 6600th book posted to Project Gutenberg
Join Project Gutenberg's Distributed Proofreaders http://www.pgdp.net/c/
================================================================================

From hart at pglaf.org  Thu Apr 21 11:26:33 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Apr 21 11:26:34 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <4267C88E.8020403@uiuc.edu>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org> <4267C88E.8020403@uiuc.edu>
Message-ID: <Pine.LNX.4.60.0504211124100.20792@pglaf.org>


On Thu, 21 Apr 2005, Robert Shimmin wrote:

>
>> 1. What is the approximate success rate & timetable for getting missing
>> pages for books in DP?  (I.e., how many books are stalled for missing
>> pages, and how many have had their pages found/restored, and how long
>> after proofreading was complete did this happen?)
>
> My experience as a DP project manager is that if I simply put a page request 
> up for grabs, I might get a bite, and I might not; I feel the half-life of 
> these "passive" requests might be measured in months.
>
> If I am more proactive, and look through library catalogs to identify a 
> library that claims to have the book I need a page from, and then ask a 
> fellow DP-er who might have access to that library, I get better results.  My 
> experience is that if I ask two or three people, at least one of them will be 
> willing and able to make the scan at their convenience, and they usually find 
> it convenient to do so within a week or three.
>
> So, if a person is willing to put the legwork into locating the book, they 
> will probably get results.  Not quickly, but not glacially slowly, either.


If you would send me such requests for inclusion in the Newsletter,
that might help.

When we put requests for such materials in the Newsletter, we usually
get a response within a week about half the time.

This goes up to about 3/4 if we leave the request in for a month.

Michael

From hart at pglaf.org  Thu Apr 21 11:47:50 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Apr 21 11:47:51 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <200504210704.j3L74ch11399@pico.dm.unipi.it>
References: <4261D8F9.4000805@frontiernet.net>
	<4262A37F.5020109@hutchinson.net>
	<6d99d1fd05041721326f39efb7@mail.gmail.com>
	<200504181821.51488.donovan@abs.net>
	<Pine.LNX.4.60.0504190939530.23081@pglaf.org>
	<6d99d1fd05041914426b0436ac@mail.gmail.com>
	<20050421051024.GE6638@pglaf.org>
	<15cfa2a505042023341fbeb5f7@mail.gmail.com>
	<200504210704.j3L74ch11399@pico.dm.unipi.it>
Message-ID: <Pine.LNX.4.60.0504211146550.20792@pglaf.org>

On Thu, 21 Apr 2005, Carlo Traverso wrote:

>
> It is much better if the incomplete projects remain at DP: when pages
> are found, DP updates both the text and the images, so that, when
> these will be made available, these will be complete too.
>
> However a page at PG with a list of current requests might be very
> useful. A page with a line for each book, with a link to a description of
> the problem. When fixed, the line will be moved to a different
> position for a while, with thanks.
>
> Of course the page could be useful for non-DP too; and may contain
> requests for books in PG that need maintenance.


Is there any reason these projects cannot be kept at DP as suggested
and also still shared with the world?


Michael

From donovan at abs.net  Thu Apr 21 15:23:16 2005
From: donovan at abs.net (D Garcia)
Date: Thu Apr 21 15:22:07 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504211146550.20792@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net>
	<200504210704.j3L74ch11399@pico.dm.unipi.it>
	<Pine.LNX.4.60.0504211146550.20792@pglaf.org>
Message-ID: <200504211823.17073.donovan@abs.net>

On Thursday 21 April 2005 02:47 pm, Michael Hart wrote:
> On Thu, 21 Apr 2005, Carlo Traverso wrote:
> > It is much better if the incomplete projects remain at DP: when pages
> > are found, DP updates both the text and the images, so that, when
> > these will be made available, these will be complete too.
>
> Is there any reason these projects cannot be kept at DP as suggested
> and also still shared with the world?
>
> Michael

Yes, we like to be thorough. :)
From donovan at abs.net  Thu Apr 21 15:24:49 2005
From: donovan at abs.net (D Garcia)
Date: Thu Apr 21 15:23:37 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504211124100.20792@pglaf.org>
References: <4261D8F9.4000805@frontiernet.net> <4267C88E.8020403@uiuc.edu>
	<Pine.LNX.4.60.0504211124100.20792@pglaf.org>
Message-ID: <200504211824.49370.donovan@abs.net>

On Thursday 21 April 2005 02:26 pm, Michael Hart eventually had this to say 
about missing page requests:
> If you would send me such requests for inclusion in the Newsletter,
> that might help.
>
> Michael

Now that is an excellent suggestion!
From tb at baechler.net  Thu Apr 21 23:16:56 2005
From: tb at baechler.net (Tony Baechler)
Date: Thu Apr 21 23:16:50 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <200504211823.17073.donovan@abs.net>
References: <Pine.LNX.4.60.0504211146550.20792@pglaf.org>
	<4261D8F9.4000805@frontiernet.net>
	<200504210704.j3L74ch11399@pico.dm.unipi.it>
	<Pine.LNX.4.60.0504211146550.20792@pglaf.org>
Message-ID: <5.2.0.9.0.20050421230947.03e12280@baechler.net>

At 06:23 PM 4/21/2005 -0400, you wrote:
>On Thursday 21 April 2005 02:47 pm, Michael Hart wrote:
> > On Thu, 21 Apr 2005, Carlo Traverso wrote:
> > > It is much better if the incomplete projects remain at DP: when pages
> > > are found, DP updates both the text and the images, so that, when
> > > these will be made available, these will be complete too.
> >
> > Is there any reason these projects cannot be kept at DP as suggested
> > and also still shared with the world?


Hi.  This is only an idea, so if it's not practical, my apologies and 
please disregard.  Why not start a subproject either within PG or DP that 
could still post the books as long as it is clearly understood that they 
are not official PG books and have x pages missing?  Maybe the PGCC could 
have such a collection.  This way people could still see the books in a 
transitional state of completeness while they would not become a part of 
the PG archive.  To take this a step further, they wouldn't be assigned PG 
ebook numbers and maybe the PGWW people won't even have to be involved, 
since the books are in a transitional state anyway.  I guess it would be 
similar to the second round of proofreading in DP: the book still has 
errors, missing pages, etc., but is available for all to see.  The wanted 
requests could still be posted to the main PG site and newsletter in hopes 
that volunteers will find such missing pages more quickly.

Since there would still need to be a way to keep track of these substandard 
books, give them the DP project numbers or no numbers at all.  They would 
still stay within DP, they would just be released earlier with notes that x 
pages are missing, x more proofing needs to be done, etc.  Again, maybe 
PGCC would be best for this so it's not directly associated with the PG 
archive. 

From traverso at dm.unipi.it  Fri Apr 22 00:23:31 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Fri Apr 22 00:20:48 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <5.2.0.9.0.20050421230947.03e12280@baechler.net> (message from
	Tony Baechler on Thu, 21 Apr 2005 23:16:56 -0700)
References: <Pine.LNX.4.60.0504211146550.20792@pglaf.org>
	<4261D8F9.4000805@frontiernet.net>
	<200504210704.j3L74ch11399@pico.dm.unipi.it>
	<Pine.LNX.4.60.0504211146550.20792@pglaf.org>
	<5.2.0.9.0.20050421230947.03e12280@baechler.net>
Message-ID: <200504220723.j3M7NVM18076@pico.dm.unipi.it>

>>>>> "Michael" == Michael Hart <hart@pglaf.org> writes:

    Michael> On Thu, 21 Apr 2005, Carlo Traverso wrote:

    >>  It is much better if the incomplete projects remain at DP:
    >> when pages are found, DP updates both the text and the images,
    >> so that, when these will be made available, these will be
    >> complete too.
    >> 
    >> However a page at PG with a list of current requests might be
    >> very useful. A page with a line for each book, with a link to a
    >> description of the problem. When fixed, the line will be moved
    >> to a different position for a while, with thanks.
    >> 
    >> Of course the page could be useful for non-DP too; and may
    >> contain requests for books in PG that need maintenance.


    Michael> Is there any reason these projects cannot be kept at DP
    Michael> as suggested and also still shared with the world?


In the forthcoming code release DP will have the so-called "Smooth
reading pool", in which books that have passed (most of) the
post-processing steps are made available for download for a final
reading, identifying the further corrections needed. 

IIRC, download will be available to non-registered users too,
(re-upload for registered users only). While availability is meant for
a short period only, it can be used for projects with missing pages
for as long as it is needed (until upload at PG).

Carlo
From collin at xs4all.nl  Fri Apr 22 05:25:23 2005
From: collin at xs4all.nl (Branko Collin)
Date: Fri Apr 22 05:13:24 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <Pine.LNX.4.60.0504211146550.20792@pglaf.org>
References: <200504210704.j3L74ch11399@pico.dm.unipi.it>
Message-ID: <42690953.26409.256859@localhost>


On 21 Apr 2005, at 11:47, Michael Hart wrote:
> On Thu, 21 Apr 2005, Carlo Traverso wrote:
> 
> > It is much better if the incomplete projects remain at DP: when
> > pages are found, DP updates both the text and the images, so that,
> > when these will be made available, these will be complete too.
>
> Is there any reason these projects cannot be kept at DP as suggested
> and also still shared with the world?

No such reason, but I will come to that later.

There are philosophical differences between PG and DP that hardly 
ever come to light except in instances such as now.

One is that DP doesn't care how long it takes before a public domain 
book is presented to the public. This is part of its very make-up; we 
distribute the work in bits that are as small as possible, and there 
are very few stakeholders who have a large interest in what finally 
will happen to the book. If neither the scanner nor the post-processor 
cares very much _when_ the book will be released, there is a chance 
that a text will be sat upon until it's ready, not until it's time.

The other difference is that nitpickers are drawn to DP the way moths 
are drawn to a flame. A lot of volunteers at DP care more about the 
quality of the works we put out than the quantity. We don't want to 
produce as many books as possible for as long a time as possible 
(part of PG's main philosophy), we want to produce good books.

Obviously I am exaggerating the differences a great deal; I make it 
sound like PG does not care about quality, and obviously that is not 
true. I also make it sound as if books sit forever at DP, while 
proofreading monks chip away at the tiniest of imperfections, which 
is also not true. But the differences that there are may account for 
why books are apparently sitting longer at DP than PG would like.

I can see several solutions for this: 

- Spring cleaning; the powers that be at DP regularly organize 
proofreading / post-processing / mentoring / whatever marathons, 
whenever they feel something needs extra attention. If there are 
truly books that have been sitting at DP for too long, we can try and 
organize something like that to flush out the forgotten projects.

- Assign quality levels; currently, a PG text is a PG text is a PG 
text no matter how much effort and attention has gone into it. This 
means there is a variety in quality that is currently not accounted 
for. (As a consequence, our bad texts are dragging down our 
reputation, causing PG's goal to reach out to as many people as possible 
to miss the mark. Some people won't read our books because of their 
reputation--see my recent discussion with David Rothman at the 
Teleread blog.)

I can see several disadvantages and several advantages to this 
proposal.

The disadvantages: 

1. PG has never liked putting out "editions". I am not sure why. 
Quality levels are like "editions".

2. Someone has to build it before we can use it. Things can go wrong 
while we use it. Readers might not understand what each level means.

3. On the PG side, someone has to check (whitewash) a book at every 
level, not just once. Corrections may have to be performed to multiple 
versions, if we choose to retain versions at older levels.

The advantages: 

1. We can publish a book during several stages of its restoration 
phase. Currently the following stages would make sense to me: 

a. After scanning (and perhaps OCR-ing)

b. After proofreading/post-processing

c. After extended mark-up/proofreading phases (what would be 
smoothreading at DP)

2. We can keep the process more transparent.

3. Users can choose between quality levels: have an unchecked, 
incomplete book now, or wait for the improved version.


-- 
branko collin
collin@xs4all.nl
From collin at xs4all.nl  Fri Apr 22 05:42:10 2005
From: collin at xs4all.nl (Branko Collin)
Date: Fri Apr 22 05:30:15 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <20050421051744.GF6638@pglaf.org>
References: <15cfa2a5050419141222f2a8ad@mail.gmail.com>
Message-ID: <42690D42.13117.34C4A1@localhost>


I think the person responsible for server management at DP is the 
much overworked Pauline/Pourlean, so I am forwarding the following to 
her per this reply.

On 20 Apr 2005, at 22:17, Greg Newby wrote:

> I'm posting here, in case discussion has stalled or this message
> didn't get to the right person previously: We're perpetually ready to
> acquire additional hardware for DP.
> 
> I can also offer lots of off-site networked storage for backups,
> "holding" items, etc., etc.  There have been numerous short
> discussions about this, but it sounds like most DP folks are busy
> doing other things, and haven't had cycles to work on expanding
> infrastructure.  So, in case this helps, I want to reiterate that
> funding for DP's hardware/network/backups/storage infrastructure is
> available.

-- 
branko collin
collin@xs4all.nl
From jon_niehof at yahoo.com  Fri Apr 22 09:04:25 2005
From: jon_niehof at yahoo.com (Jon Niehof)
Date: Fri Apr 22 09:04:32 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: 6667
Message-ID: <20050422160425.45164.qmail@web41611.mail.yahoo.com>

--- Tony Baechler <tb@baechler.net> wrote:
> Why not start a subproject either within PG or DP that could
> still post the books as long as it is clearly understood that
> they are not official PG books and have x pages missing?

The concern I'd have with this sort of approach is that, once
it's on the web, it tends to get copied all over. Even once a
completed version is out, "broken" versions may very well
continue to outnumber the other, and the value gets lost in the
noise.

This isn't an image issue, but rather a "service to posterity" issue.

From servalan at ar.com.au  Sat Apr 23 02:12:57 2005
From: servalan at ar.com.au (Pauline)
Date: Sat Apr 23 02:13:40 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <42690D42.13117.34C4A1@localhost>
References: <15cfa2a5050419141222f2a8ad@mail.gmail.com>
	<42690D42.13117.34C4A1@localhost>
Message-ID: <426A1199.7080903@ar.com.au>

Branko Collin wrote:
> I think the person responsible for server management at DP is the 
> much overworked Pauline/Pourlean, so I am forwarding the following to 
> her per this reply.

Thanks. I haven't been keeping up with mailing lists at all recently.

> 
> On 20 Apr 2005, at 22:17, Greg Newby wrote:
>
>>I'm posting here, in case discussion has stalled or this message
>>didn't get to the right person previously: We're perpetually ready to
>>acquire additional hardware for DP.
>>
>>I can also offer lots of off-site networked storage for backups,
>>"holding" items, etc., etc.  There have been numerous short
>>discussions about this, but it sounds like most DP folks are busy
>>doing other things, and haven't had cycles to work on expanding
>>infrastructure.  So, in case this helps, I want to reiterate that
>>funding for DP's hardware/network/backups/storage infrastructure is
>>available.

I suspect my last email to Greg went west, so I'll resend privately.

As to the issue of extra disk space... I've said a few times on the DP 
Forums that after developers the thing which DP lacks most is PPers, 
i.e. the people who take the proofed text & turn it into ebooks. Adding 
extra disk space will solve the problem in the medium term, but in the 
end the PP mountain will just grow higher, until we can match the number 
of posted projects to the number proofed.

I am really hoping that the upcoming site upgrade will help with this 
problem as the extra formatting rounds & open smooth reading pool will 
hopefully make life much easier for PPers.

If you're not a developer, at the moment the best thing you can do for 
DP is to PP or PPV, so we can get projects posted to PG & into the 
archive (i.e. off the production server).

Here's a more detailed post on how people can help - the numbers have 
changed since November & we did some recoding of how images are handled 
to recover some disk space - but essentially the same issue remains:
http://www.pgdp.net/phpBB2/viewtopic.php?p=96304#96304

Thanks,
P
-- 
Help digitise public domain books:
Distributed Proofreaders: http://www.pgdp.net
"Preserving history one page at a time."

Set free dead-tree books:
http://bookcrossing.com/referral/servalan
From jonathan_ingram at yahoo.com  Sat Apr 23 06:25:22 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Sat Apr 23 06:25:27 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Canadian
	Libraries collection
Message-ID: <20050423132522.83607.qmail@web41705.mail.yahoo.com>

A week ago, DP started systematically working through the Internet Archive's
Canadian Libraries collection. In that week, we have looked at 208 of the 798
books in the archive. Of these, 22 books are identical to books which have
previously been through DP, so will not need to be looked at further, and 18
books have errors (missing or blurred pages) -- the remaining 168 are being
processed and should eventually all move through DP. The aim is to eventually
process *every* book in this collection, and then move on to others.

You can monitor the current progress of our harvesting effort here:
  http://homepage.ntlworld.com/jenjonliz/jon/tia/canadianlibraries.html
And (if you are a DP project manager) claim texts using the following thread in
the DP forum:
  http://www.pgdp.net/phpBB2/viewtopic.php?t=14768

-- 
Jon Ingram


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 
From scott_bulkmail at productarchitect.com  Sat Apr 23 11:41:33 2005
From: scott_bulkmail at productarchitect.com (Scott Lawton)
Date: Sat Apr 23 11:42:43 2005
Subject: [gutvol-d] suggestions for volunteer page
Message-ID: <p06110425be9043164608@[192.168.0.52]>

i.e. for http://www.gutenberg.org/info/volunteer

This seems like a good place to put requests for missing pages and such.

Incidentally, I just came across a different sort of "missing page" issue: a possible error in a posted text that requires verification against the original scan -- which is (apparently) long gone.

Also, I really hope that PG soon links to scans for all books where scans exist.
-- 

Cheers,

Scott S. Lawton
http://Classicosm.com/ - classic books
http://ProductArchitect.com/ - consulting
From marcello at perathoner.de  Sat Apr 23 14:14:20 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sat Apr 23 14:14:33 2005
Subject: [gutvol-d] Questions on PG and other ebooks
In-Reply-To: <426A1199.7080903@ar.com.au>
References: <15cfa2a5050419141222f2a8ad@mail.gmail.com>	<42690D42.13117.34C4A1@localhost>
	<426A1199.7080903@ar.com.au>
Message-ID: <426ABAAC.2070706@perathoner.de>

Pauline wrote:

> Here's a more detailed post on how people can help - the numbers have 
> changed since November & we did some recoding of how images are handled 
> to recover some disk space - but essentially the same issue remains:
> http://www.pgdp.net/phpBB2/viewtopic.php?p=96304#96304

Is there a list of missing pages that you don't have to log in to see? 
We could put up a link from the pg web site.

Ideally the list should be printable and contain exact edition data plus 
the last paragraph of the preceding and the first para of the next page.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From gbuchana at rogers.com  Sun Apr 24 10:42:34 2005
From: gbuchana at rogers.com (Gardner Buchanan)
Date: Sun Apr 24 10:42:51 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Ca
In-Reply-To: <20050423132522.83607.qmail@web41705.mail.yahoo.com>
Message-ID: <XFMail.050424134234.gbuchana@rogers.com>

Hi Jonathan,

I have done some work developing scripts to re-process the page
image sets from the Toronto archive.  If you're interested, maybe
we should compare notes.  I've found the images to be quite high
quality.

You didn't mention reconciling your list with the cleared/books in
progress list, and looking at your web page, I see that you intend to
process at least one of those books in progress - by me - namely:

Macdonald, (Captain) John A
   Troublous Times in Canada: A History of the Fenian Raids of 1866 and 1870
   Clearance OK key=20041231142201macdonald

I've always found it a let down when a book I've laboured over
pops up in PG 3 days before my year-long effort is finally finished.
I have dibs on this: hands off.

Also, your list also has a duplicate entry for this title.


On 13:25:22 Jonathan Ingram wrote:
> A week ago, DP started systematically working through the Internet Archive's
> Canadian Libraries collection. In that week, we have looked at 208 of the 798
> books in the archive. Of these, 22 books are identical to books which have
> previously been through DP, so will not need to be looked at further, and 18
> books have errors (missing or blurred pages) -- the remaining 168 are being
> processed and should eventually all move through DP. The aim is to eventually
> process *every* book in this collection, and then move on to others.
> 
> You can monitor the current progress of our harvesting effort here:
>   http://homepage.ntlworld.com/jenjonliz/jon/tia/canadianlibraries.html
> And (if you are a DP project manager) claim texts using the following thread
> in
> the DP forum:
>   http://www.pgdp.net/phpBB2/viewtopic.php?t=14768
> 
From grythumn at gmail.com  Sun Apr 24 11:39:45 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Sun Apr 24 11:39:53 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Ca
In-Reply-To: <XFMail.050424134234.gbuchana@rogers.com>
References: <20050423132522.83607.qmail@web41705.mail.yahoo.com>
	<XFMail.050424134234.gbuchana@rogers.com>
Message-ID: <15cfa2a50504241139324897c4@mail.gmail.com>

*Interspersed*

On 4/24/05, Gardner Buchanan <gbuchana@rogers.com> wrote:
> Hi Jonathan,
> 
> You didn't mention reconciling your list with the cleared/books in
> progress list, and looking at your web page, I see that you intend to
> process at least one of those books in progress - by me - namely:
> 
> Macdonald, (Captain) John A
>   Troublous Times in Canada: A History of the Fenian Raids of 1866 and 1870

Apparently you didn't read the header at all; this is a list of
_all_ of the books in the Canadian Library section of IA that are in
their catalog. The PM making a claim is responsible for checking
against the In-Progress list; Jon is simply providing a faster method
of reducing duplicate claims than Mr. Price's invaluable monthly
updates.

Similar setups have happened around holidays, where a number of
Christmas or Halloween books may be cleared within days of each other.

> I've always found it a let down when a book I've laboured over
> pops up in PG 3 days before my year-long effort is finally finished.
> I have dibs on this: hands off.

Your tone is a bit harsh. If you bothered to read the header, you
would have seen (by the lack of a color code) that those books are
currently unclaimed by anyone.

A simple note to the thread (I see that you have a DP account) or
privately to Jon, would get them claimed in your name. Actually, while
I am typing this you have done so, although it would have been better
to claim both.

> Also, your list also has a duplicate entry for this title.

That is because the IA has two copies of this title. One taken on
their original setup, and another one taken later, with a different
camera.

From a post by Molly at IA:
We tagged all of the books with what kind of scanner scanned them.
Here's the key:
Kirtas APT 1200 #1- prototype robot with 8 megapixel camera, originals
shot at around 250DPI, processed images interpolated to 300DPI
(processed will be bigger, around 3MB each).
Kirtas APT 1200 #2- production robot, same kind of camera as #1 
Kirtas APT 1200 #2.5- production robot, new 16megapixel camera.
originals are 500ish DPI (9-10MB each), most of the time processed
images are 300DPI (~2MB each).

R C
From Bowerbird at aol.com  Sun Apr 24 13:20:15 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Apr 24 13:20:32 2005
Subject: [gutvol-d] the .html version of e-text #15701
Message-ID: <1d4.3aaa3df9.2f9d597f@aol.com>


i'm trying to look at the .html version of #15701,
which i downloaded as a zip file to my own machine,
and it seems to require an open internet connection.
it wants to call the w3 or something.  why?

even with such a connection, it won't display in ie5.1.
works in opera 5, but not internet explorer.    why not?

and the .html version of #15698 won't work in either one...

help, anyone?

-bowerbird
From jonathan_ingram at yahoo.com  Sun Apr 24 15:35:07 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Sun Apr 24 15:35:18 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Ca
In-Reply-To: 6667
Message-ID: <20050424223507.3615.qmail@web41712.mail.yahoo.com>

--- Gardner Buchanan <gbuchana@rogers.com> wrote:
> I have done some work developing scripts to re-process the page
> image sets from the Toronto archive.  If you're interested, maybe
> we should compare notes.  I've found the images to be quite high
> quality.

Almost all of the people working on the Toronto archive, and the other
archive.org page image archives, are using the generated DjVu files, as we
don't have the bandwidth to download half a gig or more of images per book.
These are usually of good enough quality to OCR.

> You didn't mention reconciling your list with the cleared/books in
> progress list

The reconciliation is done by people informing me when material on this list is
already in progress. It might be possible to do some of this by automatically
comparing David's In Progress List with this list, but nothing along those
lines has yet been done. It's up to the individuals who claim books from this
list to check their status.
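
As a rough illustration of the automatic comparison mentioned above, here is a minimal Python sketch. It assumes both lists are available as plain lists of title strings, which is an assumption: nothing here reflects the actual format of the In Progress List or of the harvesting page, and the function names are made up for the example.

```python
import re


def normalize(title):
    """Lowercase a title and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def already_in_progress(harvest_titles, in_progress_titles):
    """Return the harvest titles whose normalized form is in the in-progress list."""
    claimed = {normalize(title) for title in in_progress_titles}
    return [title for title in harvest_titles if normalize(title) in claimed]
```

Exact matching on normalized titles will miss variant titles and different editions, so a check like this could only flag candidates for a person to confirm.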

> [...] and looking at your web page, I see that you intend to
> process at least one of those books in progress - by me - namely:
> <snip>
> Also, your list also has a duplicate entry for this title.

Thanks for reporting these in the thread on the DP forum. Both entries have
been marked as already in progress.

-- 
Jon Ingram


From jtinsley at pobox.com  Sun Apr 24 16:27:15 2005
From: jtinsley at pobox.com (Jim Tinsley)
Date: Sun Apr 24 16:27:28 2005
Subject: [gutvol-d] the .html version of e-text #15701
In-Reply-To: <1d4.3aaa3df9.2f9d597f@aol.com>
References: <1d4.3aaa3df9.2f9d597f@aol.com>
Message-ID: <20050424232715.GA24277@panix.com>

On Sun, Apr 24, 2005 at 04:20:15PM -0400, Bowerbird@aol.com wrote:
>
>i'm trying to look at the .html version of #15701,
>which i downloaded as a zip file to my own machine,
>and it seems to require an open internet connection.
>it wants to call the w3 or something.  why?
>

I don't know. It doesn't for me. It's conceivable--just--
that your browser is trying to pre-fetch the W3 DTD
as defined in the DOCTYPE declaration, but it's the
first time I ever heard of something like that happening.
And the same declaration is in lots of texts; nothing
new or strange about this one.

I'm glad you mentioned it, though, because when I posted
it last night, I made a final one-character change and
instead of copying it to /gut, I copied it to gut, which
resulted in my uploading both the real file and a copy
of it with the change named "gut". I've fixed and
reuploaded.

>even with such a connection, it won't display in ie5.1.
>works in opera 5, but not internet explorer.    why not?

I have really given up trying to figure out why IE, any
version, doesn't work right, so you're on your own there!
I can tell you that it displays fine in my IE 6, Mozilla,
K-Meleon . . . even Lynx. :-)

>
>and the .html version of #15698 won't work in either one...
>

That one is more interesting; it doesn't have a terminating
HTML comment mark after the <style>. However, the W3C
validators have no problem with it, and the parse tree is
recognized, and I've given up trying to track all the ways
that foreign command-sets or languages can be embedded in
HTML. Maybe a newer browser will help. What is Opera on now? 8?

jim

From shimmin at uiuc.edu  Mon Apr 25 06:27:35 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Mon Apr 25 06:27:44 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Ca
In-Reply-To: <XFMail.050424134234.gbuchana@rogers.com>
References: <XFMail.050424134234.gbuchana@rogers.com>
Message-ID: <426CF047.5050002@uiuc.edu>

> I've always found it a let down when a book I've laboured over
> pops up in PG 3 days before my year-long effort is finally finished.
> I have dibs on this: hands off.

Despair not, but instead Rejoice! Having two independent transcriptions 
is a great way to catch errors, since two transcribers working from 
different copies of the book, by different methods, are unlikely to make 
the same error.  Assuming there aren't major textual differences, 
diffing two independent transcriptions and checking the discrepancies is 
one of the best ways to get our quality up to five or even six nines.
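
A minimal sketch of such a comparison, using Python's standard difflib module. The function name is hypothetical, and the sketch assumes both transcriptions are already line-aligned plain text, which two real transcriptions made from different editions may not be.

```python
import difflib


def discrepancies(text_a, text_b):
    """Return (lines_from_a, lines_from_b) pairs wherever the two texts differ."""
    a = text_a.splitlines()
    b = text_b.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Each returned pair is a spot where a human would go back to the page image and decide which reading is right.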

-- RS
From jonathan_ingram at yahoo.com  Mon Apr 25 07:14:16 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Mon Apr 25 07:14:24 2005
Subject: [gutvol-d] Update on Harvesting of the Internet Archive's Ca
In-Reply-To: 6667
Message-ID: <20050425141416.92344.qmail@web41703.mail.yahoo.com>


--- Robert Shimmin <shimmin@uiuc.edu> wrote:
> > I've always found it a let down when a book I've laboured over
> > pops up in PG 3 days before my year-long effort is finally finished.
> > I have dibs on this: hands off.
> 
> Despair not, but instead Rejoice! Having two independent transcriptions 
> is a great way to catch errors, since two transcribers working from 
> different copies of the book, by different methods, are unlikely to make 
> the same error.  Assuming there aren't major textual differences, 
> diffing two independent transcriptions and checking the discrepancies is 
> one of the best ways to get our quality up to five or even six nines.

Very true. Although we're not aiming to redo books that people are currently
working on, I see nothing wrong with running books through DP which are already
in PG. There are several reasons for this, including error correction, and the
creation of HTML editions, which are basically standard these days for
DP-produced texts.

-- 
Jon Ingram


From lee at novomail.net  Tue Apr 26 13:29:09 2005
From: lee at novomail.net (Lee Passey)
Date: Tue Apr 26 13:30:44 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <20050425190003.8CBDA8C8EF@pglaf.org>
References: <20050425190003.8CBDA8C8EF@pglaf.org>
Message-ID: <426EA495.5010905@novomail.net>

gutvol-d-request@lists.pglaf.org wrote:

>  On Sun, Apr 24, 2005 at 04:20:15PM -0400, Bowerbird@aol.com wrote:
>
> > i'm trying to look at the .html version of #15701, which i
> > downloaded as a zip file to my own machine, and it seems to require
> > an open internet connection. it wants to call the w3 or something.
> > why?
>
>  I don't know. It doesn't for me. It's conceivable--just-- that your
>  browser is trying to pre-fetch the W3 DTD as defined in the DOCTYPE
>  declaration, but it's the first time I ever heard of something like
>  that happening. And the same declaration is in lots of texts; nothing
>  new or strange about this one.

Internet Explorer 5.5 is the only browser I have for which my firewall
is configured to prevent outgoing access without permission. Opening
this file in IE 5.5 does not create any outgoing connections.
Examination of the file reveals that there are no resources referenced
in the file external to the file itself except: 1. the DTD declaration,
and 2. an image of the Burke coat of arms. Given the fact that your
browser is attempting to contact the W3C (the "owners" of the XHTML DTD)
I would agree with Mr. Tinsley that your browser seems to be attempting
to fetch the declared DTD. In fact, given that Opera seems to have
fairly good support for most XML vocabularies other than XHTML, I would
bet that you're seeing this behavior when using the Opera browser.

When you refuse the outgoing connection, is the document displayed
anyway? (not that it will make any difference, but I _am_ curious).

[snip]

> > and the .html version of #15698 won't work in either one...
> >
>
>  That one is more interesting; it doesn't have a terminating HTML
>  comment mark after the <style>. However, the W3C validators have no
>  problem with it, and the parse tree is recognized, and I've given up
>  trying to track all the ways that foreign command-sets or languages
>  can be embedded in HTML. Maybe a newer browser will help. What is
>  Opera on now? 8?

The problem is, indeed, the unterminated comment. The XHTML DTD defines
the <style> element as containing #PCDATA, which is to say textual
'stuff' which may or may not be HTML. An HTML User Agent should _not_
attempt to parse any of the data between <style> and </style>, but
should pass that text on to the stylesheet parser.

It has become common to embed an internal style declaration inside HTML
comments (<!--  -->) for compatibility with older browsers which did not
support style sheets.  If a browser did not support style sheets it
would encounter the <style> tag and ignore it, as all good browsers are
designed to do. It would then encounter the HTML comment tag and ignore
everything until the closing tag was encountered. That way the browser
wouldn't display the style definitions as just more text. On the other
hand, stylesheet parsers are designed to ignore the comment tags
themselves, so all the stylesheet goodness is visible to a stylesheet
parser.

While the lack of a closing comment tag in the <style> element is a bug
in the document, the failure of your browser(s) to ignore comment tags
in a <style> element is also a bug in those programs. I don't have a
working installation of IE prior to 5.5 (which does _not_ have this
problem), but the problem does present itself in Opera 7.11 and has been
fixed in Opera 7.51. My experience has been that in the past Opera has
been somewhat slavishly devoted to mimicking the behavior of IE, even
when that behavior is contrary to internet standards (Javascript
implementations come to mind). It is therefore not surprising that early
versions of Opera should have the same behavior as early versions of IE.

Despite the bugginess of your browsers, the HTML text at Project
Gutenberg really should be fixed, as the error will prevent the text from
displaying in any browser which does not support the <style> element.

Because the contents of a <style> element is #PCDATA, HTML validators
will generally not be able to catch this type of error. I have examined
the source code for HTML Tidy, and when it encounters a <style> tag it
simply creates a text node for the entire text up to the </style> tag.
No validation of the actual style sheets is performed. I suspect that
the W3C validator operates the same way. Validators are good tools, but
satisfying a validator does not mean that the HTML is, in fact, valid --
only that there are no errors of the type that the validators are
designed to catch.
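
The kind of check that falls outside a validator's remit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up, and the regular expression is a rough stand-in for real HTML parsing that assumes each <style> element is closed by a literal </style> tag.

```python
import re

# Rough pattern for a <style> element; not a full HTML parser.
STYLE_RE = re.compile(r"<style\b[^>]*>(.*?)</style\s*>", re.IGNORECASE | re.DOTALL)


def unterminated_style_comments(html):
    """Return indexes of <style> blocks that open '<!--' without a matching '-->'."""
    bad = []
    for index, match in enumerate(STYLE_RE.finditer(html)):
        body = match.group(1)
        if body.count("<!--") > body.count("-->"):
            bad.append(index)
    return bad
```

Run over a posted file before upload, a check like this would flag the missing closing mark regardless of what the validator reports.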

On a related note, let me say that I view internal style declarations as
just plain rude. Style sheets are indeed A Good Thing, but someone
imposing their quirky notions of style on me is not. By placing style
definitions in an external style sheet and simply linking that style
sheet into the main document with a <link> element, it makes it easy for
me to strip away the suggested styles, and return to browser defaults,
by simply deleting or renaming the style sheet. And if the suggested
styles are mostly good, and need only a slight tweaking, it is safer and
easier to edit an external style sheet than the main document. I would
strongly encourage all PG volunteers who are creating HTML documents to
consider putting suggested style definitions in an external style sheet
rather than embedding those styles in the main document.
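
For anyone who wants to try this on an existing file, here is a minimal Python sketch that lifts internal <style> blocks out into a separate sheet and leaves a <link> in their place. The function name and the default file name "book.css" are assumptions for illustration, and the regular expression is a rough stand-in for proper HTML parsing.

```python
import re

# Rough pattern for a <style> element; not a full HTML parser.
STYLE_RE = re.compile(r"<style\b[^>]*>(.*?)</style\s*>", re.IGNORECASE | re.DOTALL)


def externalize_styles(html, href="book.css"):
    """Return (new_html, css) with internal <style> blocks replaced by one <link>."""
    css_parts = []

    def replace(match):
        css = match.group(1).strip()
        # Drop the legacy HTML comment wrapper if it is present.
        if css.startswith("<!--"):
            css = css[4:]
        if css.endswith("-->"):
            css = css[:-3]
        css_parts.append(css.strip())
        # Only the first block becomes the <link>; later blocks are removed.
        if len(css_parts) == 1:
            return '<link rel="stylesheet" type="text/css" href="%s">' % href
        return ""

    new_html = STYLE_RE.sub(replace, html)
    return new_html, "\n".join(css_parts)
```

The caller would write the returned CSS to the file named in href, next to the HTML file.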


From joshua at hutchinson.net  Tue Apr 26 14:04:06 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Apr 26 14:04:16 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
Message-ID: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>


----- Original Message -----
From: "Lee Passey" <lee@novomail.net>
> 
> On a related note, let me say that I view internal style declarations as
> just plain rude. Style sheets are indeed A Good Thing, but someone
> imposing their quirky notions of style on me is not. By placing style
> definitions in an external style sheet and simply linking that style
> sheet into the main document with a <link> element, it makes it easy for
> me to strip away the suggested styles, and return to browser defaults,
> by simply deleting or renaming the style sheet. And if the suggested
> styles are mostly good, and need only a slight tweaking, it is safer and
> easier to edit an external style sheet than the main document. I would
> strongly encourage all PG volunteers who are creating HTML documents to
> consider putting suggested style definitions in an external style sheet
> rather than embedding those styles in the main document.
> 

Let me say that I agree.  But right now, the ww'ers have indicated that they want all styles inline.  I think this practice should be changed, but it isn't my call.

Josh
From marcello at perathoner.de  Tue Apr 26 14:43:06 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Apr 26 14:43:18 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <426EA495.5010905@novomail.net>
References: <20050425190003.8CBDA8C8EF@pglaf.org>
	<426EA495.5010905@novomail.net>
Message-ID: <426EB5EA.5020904@perathoner.de>

Lee Passey wrote:

> The problem is, indeed, the unterminated comment. The XHTML DTD defines
> the <style> element as containing #PCDATA, which is to say textual
> 'stuff' which may or may not be HTML.

The problem cannot be the "unterminated comment" because the "comment" 
is no comment at all.

Let's recap:

File

   http://www.gutenberg.org/files/15698/15698-h/15698-h.htm

has a doctype of

   <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
                         "http://www.w3.org/TR/html4/loose.dtd">

So let's take a look at the *HTML 4.01* specs (not the *XHTML* specs):

   <!ENTITY % StyleSheet "CDATA" -- style sheet data -->

   <!ELEMENT STYLE - - %StyleSheet        -- style info -->

The STYLE element contains CDATA, which is not parsed. In CDATA < has no 
special meaning at all and cannot therefore start a comment.


 > An HTML User Agent should _not_
 > attempt to parse any of the data between <style> and </style>, but
 > should pass that text on to the stylesheet parser.

A user agent that *knows* about style sheets will not. A user agent 
developed before CSS will just ignore the style tags but will process 
the data in between.


> While the lack of a closing comment tag in the <style> element is a bug
> in the document, the failure of your browser(s) to ignore comment tags
> in a <style> element is also a bug in those programs.

No bug. The program is simply too old and decrepit to know anything about
style sheets. It skips the opening style tag (as it should as of the
HTML standard before style sheets) and continues parsing, because it
doesn't know about the contents of style being CDATA. It then finds an
opening comment tag and that's all it's gonna see for a long long time
because the closing -- is found way down in the license.


> Because the contents of a <style> element is #PCDATA, HTML validators
> will generally not be able to catch this type of error. 

It is not! (People should really read the specs!) If it were #PCDATA 
(*Parsed* Character DATA) the validator would find the error because it 
would parse it. It's because it's CDATA that the validator doesn't parse
it and in consequence cannot find the "error". In CDATA < has no special 
meaning at all.
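
The same behaviour can be observed with a CSS-aware parser. Python's standard html.parser also treats the content of <style> as CDATA, so the following minimal sketch (the class and function names are made up for the example) shows the "<!--" arriving as plain data and never as a comment:

```python
from html.parser import HTMLParser


class StyleProbe(HTMLParser):
    """Record the text seen inside <style> and any comments the parser reports."""

    def __init__(self):
        super().__init__()
        self.in_style = False
        self.style_text = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False

    def handle_data(self, data):
        if self.in_style:
            self.style_text.append(data)

    def handle_comment(self, data):
        self.comments.append(data)


def probe(html):
    """Return (text found inside <style>, list of comments) for the given HTML."""
    parser = StyleProbe()
    parser.feed(html)
    parser.close()
    return "".join(parser.style_text), parser.comments
```

Feeding it a <style> block with an unclosed "<!--" yields the marker inside the style text and an empty comment list, which is why a parser of this kind has nothing to complain about.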


-- 
Marcello Perathoner
webmaster@gutenberg.org

From hacker at gnu-designs.com  Tue Apr 26 15:00:17 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Tue Apr 26 15:01:15 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
Message-ID: <Pine.LNX.4.62.0504261757450.7291@angst.gnu-designs.com>


> Let me say that I agree.  But right now, the ww'ers have indicated 
> that they want all styles inline.  I think this practice should be 
> changed, but it isn't my call.

	This isn't a boolean decision and there are reasons for and
against inline styles vs. external styles. If you have an extremely
heavily hit site (like I do), inline styles make sense... _except_ when
your stylesheet itself is overly large (hundreds of lines or more).

	The balance has to be considered: Is it worth it to incur
another socket connection and round trip from the client (more server
hits to retrieve resources) or is it better to have a larger
byte-per-hit (but fewer socket connections) per-client?

	Every site has its own requirements, and for some you'll
want to use both approaches. It's not an all-or-nothing decision.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From jon at noring.name  Tue Apr 26 15:15:06 2005
From: jon at noring.name (Jon Noring)
Date: Tue Apr 26 15:15:35 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS
In-Reply-To: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
Message-ID: <1429845873.20050426161506@noring.name>

Joshua wrote:
> Lee Passey wrote:

>> On a related note, let me say that I view internal style declarations as
>> just plain rude. Style sheets are indeed A Good Thing, but someone
>> imposing their quirky notions of style on me is not. By placing style
>> definitions in an external style sheet and simply linking that style
>> sheet into the main document with a <link> element, it makes it easy for
>> me to strip away the suggested styles, and return to browser defaults,
>> by simply deleting or renaming the style sheet. And if the suggested
>> styles are mostly good, and need only a slight tweaking, it is safer and
>> easier to edit an external style sheet than the main document. I would
>> strongly encourage all PG volunteers who are creating HTML documents to
>> consider putting suggested style definitions in an external style sheet
>> rather than embedding those styles in the main document.

> Let me say that I agree.  But right now, the ww'ers have
> indicated that they want all styles inline.  I think this practice
> should be changed, but it isn't my call.

I'm a little puzzled by this because it implies there is no
"standardization" of the HTML markup.

I think the XHTML markup should be standardized enough around a structural/
semantic basis (not a presentational basis) so that a standardized style
sheet can be used for most of the books in the proofing process.

In XHTML this can be a standardized "class" library mapped from TEI
tags (for a possible flavor of TEI to use, see Marcello's latest draft
of PGTEI at

   http://www.gutenberg.org/tei/marcello/0.3/doc/20000-h/20000-h.html )


Jon Noring


From marcello at perathoner.de  Tue Apr 26 15:31:11 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Apr 26 15:31:24 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <Pine.LNX.4.62.0504261757450.7291@angst.gnu-designs.com>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
	<Pine.LNX.4.62.0504261757450.7291@angst.gnu-designs.com>
Message-ID: <426EC12F.3020607@perathoner.de>

David A. Desrosiers wrote:

> 	This isn't a boolean decision and there are reasons for and 
> against inline styles vs. external styles. If you have an extremely 
> heavy hit site (like I do), inline styles makes sense... _except_ when 
> your stylesheet itself is overly large (hundreds of lines or more).
> 
> 	The balance has to be consired: Is it worth it to incur 
> another socket connection and round trip from the client (more server 
> hits to retrieve resources) or is it better to have a larger 
> byte-per-hit (but less socket connections) per-client?

If connections are your concern why don't you use keep-alive connections
on your site? Modern UAs and webservers can download an HTML page with 
CSS and all images in one connection:


$ wget -S --mirror http://www.gutenberg.org/files/15701/15701-h/15701-h.htm

--00:23:03--  http://www.gutenberg.org/files/15701/15701-h/15701-h.htm
            => `www.gutenberg.org/files/15701/15701-h/15701-h.htm'
Resolving www.gutenberg.org... 152.2.210.81
Connecting to www.gutenberg.org[152.2.210.81]:80... connected.
HTTP request sent, awaiting response...
  1 HTTP/1.1 200 OK
  2 Date: Tue, 26 Apr 2005 22:23:04 GMT
  3 Server: Apache/1.3.33 (Unix) PHP/4.3.10
  4 Last-Modified: Sun, 24 Apr 2005 19:59:32 GMT
  5 ETag: "5de5d6-f4c14-426bfaa4"
  6 Accept-Ranges: bytes
  7 Content-Length: 1002516
  8 Keep-Alive: timeout=5, max=20
  9 Connection: Keep-Alive
10 Content-Type: text/html

--00:23:28--  http://www.gutenberg.org/robots.txt
            => `www.gutenberg.org/robots.txt'
Reusing connection to www.gutenberg.org:80.
HTTP request sent, awaiting response...
  1 HTTP/1.1 200 OK
  2 Date: Tue, 26 Apr 2005 22:23:28 GMT
  3 Server: Apache/1.3.33 (Unix) PHP/4.3.10
  4 Last-Modified: Tue, 19 Apr 2005 12:46:20 GMT
  5 ETag: "3e587-161-4264fd9c"
  6 Accept-Ranges: bytes
  7 Content-Length: 353
  8 Keep-Alive: timeout=5, max=19
  9 Connection: Keep-Alive
10 Content-Type: text/plain

--00:23:28--  http://www.gutenberg.org/files/15701/15701-h/images/001.png
            => `www.gutenberg.org/files/15701/15701-h/images/001.png'
Reusing connection to www.gutenberg.org:80.
HTTP request sent, awaiting response...
  1 HTTP/1.1 200 OK
  2 Date: Tue, 26 Apr 2005 22:23:28 GMT
  3 Server: Apache/1.3.33 (Unix) PHP/4.3.10
  4 Cache-Control: max-age=86400
  5 Expires: Wed, 27 Apr 2005 22:23:28 GMT
  6 Last-Modified: Sun, 24 Apr 2005 19:59:32 GMT
  7 ETag: "5de5d8-3575-426bfaa4"
  8 Accept-Ranges: bytes
  9 Content-Length: 13685
10 Keep-Alive: timeout=5, max=18
11 Connection: Keep-Alive
12 Content-Type: image/png
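
The same connection reuse can be reproduced without touching a live site. The following is a minimal Python sketch, not a description of how gutenberg.org is configured: it starts a throwaway HTTP/1.1 server on the loopback interface and fetches several paths over a single client connection. The names Handler, seen_ports and fetch_all, and the paths passed in, are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_ports = []  # client source port of every request the server handles


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = b"body of " + self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def fetch_all(paths):
    """Fetch every path over ONE connection; return (bodies, client ports seen)."""
    seen_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            bodies.append(conn.getresponse().read())
        return bodies, list(seen_ports)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

If the server sees the same client port for every request, all of them travelled over one socket, which is what the "Reusing connection" lines in the wget output show.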


-- 
Marcello Perathoner
webmaster@gutenberg.org

From joshua at hutchinson.net  Tue Apr 26 15:58:33 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Apr 26 15:58:46 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <1429845873.20050426161506@noring.name>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
	<1429845873.20050426161506@noring.name>
Message-ID: <426EC799.4070504@hutchinson.net>

Jon Noring wrote:

>I'm a little puzzled by this because it implies there is no
>"standardization" of the HTML markup.
>
>I think the XHTML markup should be standardized enough around a structural/
>semantic basis (not a presentational basis) so that a standardized style
>sheet can be used for most of the books in the proofing process.
>
>In XHTML this can be a standardized "class" library mapped from TEI
>tags (for a possible flavor of TEI to use, see Marcello's latest draft
>of PGTEI at
>
>   http://www.gutenberg.org/tei/marcello/0.3/doc/20000-h/20000-h.html )
>
>  
>
The "standardization" is there ... it just doesn't go as far as to 
specify a standard style sheet.

Now, the TEI has a "working" standard style sheet, but there have 
already been some changes identified in testing.

Once we have the final transforms worked out, I plan on having an open
call for style sheets on DP.

Josh
From nwolcott at dsdial.net  Tue Apr 26 17:16:39 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Tue Apr 26 17:17:08 2005
Subject: [gutvol-d] Dict of National Biography
Message-ID: <002b01c54abe$680496a0$269495ce@gw98>

The other day I went to the local library to consult their copy of the Dictionary of National Biography, 22 Vol 1921. It was no longer on the shelf. Upon enquiry, I was informed that it was library policy to surplus all material which might be replaced by an electronic resource: in this case the Gale Biographical Database. This database gives old entries from Who's Who. Not at all the same. The set was given to their surplus book sale. I note it costs $3000 on Abe. And of course when the books are gone there are no more to replace them (well, there may have been a reprint or two). Also, the books "were old and out of date" and "not in use". The reference books in use were "All the plots of Shakespeare's Plays", "Condensed 100 novels with plot lines" etc. I suggested they were encouraging theft as a mode of preservation. They were not amused. Disposal was up to the local librarian, in this case someone whom I believe probably thought there was no need for English History anyway.


N Wolcott  nwolcott2@post.harvard.edu
From geoff.horton at gmail.com  Tue Apr 26 17:49:59 2005
From: geoff.horton at gmail.com (Geoff Horton)
Date: Tue Apr 26 17:50:12 2005
Subject: [gutvol-d] Dict of National Biography
In-Reply-To: <002b01c54abe$680496a0$269495ce@gw98>
References: <002b01c54abe$680496a0$269495ce@gw98>
Message-ID: <94e5f59605042617497035a2f9@mail.gmail.com>

Have you read Connie Willis's _Bellwether_?

On 4/26/05, N Wolcott <nwolcott@dsdial.net> wrote:
>  
>  Also the
> "books were old and out of date" were "not in use" . The reference books in
> use were "All the plots of Shakespeare's Plays", "Condensed 100  novels with
> plot lines" etc. I suggested they were encouraging theft as a mode of
> preservation. They were not amused. Disposal was up to the local librarian,
> in this case someone whom I believe probably thought there was no need for
> English History anyway.
From hacker at gnu-designs.com  Tue Apr 26 19:56:37 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Tue Apr 26 19:57:11 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <426EC12F.3020607@perathoner.de>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
	<Pine.LNX.4.62.0504261757450.7291@angst.gnu-designs.com>
	<426EC12F.3020607@perathoner.de>
Message-ID: <Pine.LNX.4.62.0504262254260.9174@angst.gnu-designs.com>


> If connections are your concern why don't you use keep-alive 
> connections on your site? Modern UAs and webservers can download an 
> HTML page with CSS and all images in one connection:

	Because KeepAlive hurts performance on heavily-loaded servers, 
and because there are lots of exploits running about specifically used 
to tie up webservers that use KeepAlive by leaving each socket in a 
TIME_WAIT state. I've had to use the TARPIT module in iptables to work 
around some of it over the last two weeks myself (works great!).

	With Apache Benchmark pounding various pages on the same 
physical box, with KeepAlive enabled, performance drops by about 80% 
(we're behind Squid as an HTTP accelerator anyway, so KeepAlive on 
the Apache side is moot).


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From jon at noring.name  Tue Apr 26 20:20:35 2005
From: jon at noring.name (Jon Noring)
Date: Tue Apr 26 20:21:34 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS
In-Reply-To: <426EC799.4070504@hutchinson.net>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>
	<1429845873.20050426161506@noring.name>
	<426EC799.4070504@hutchinson.net>
Message-ID: <1235253024.20050426212035@noring.name>

Josh replied:
> Jon Noring wrote:

>> I'm a little puzzled by this because it implies there is no
>> "standardization" of the HTML markup.
>>
>> I think the XHTML markup should be standardized enough around a structural/
>> semantic basis (not a presentational basis) so that a standardized style
>> sheet can be used for most of the books in the proofing process.
>>
>> In XHTML this can be a standardized "class" library mapped from TEI
>> tags (for a possible flavor of TEI to use, see Marcello's latest draft
>> of PGTEI at
>>
>> http://www.gutenberg.org/tei/marcello/0.3/doc/20000-h/20000-h.html )

> The "standardization" is there ... it just doesn't go as far as to 
> specify a standard style sheet.

Ah, ok, so I was probably preaching to the choir.


> Now, the TEI has a "working" standard style sheet, but there have 
> already been some changes identified in testing.
>
> Once, we have the final transforms worked out, I plan on having an open
> call for style sheets on DP.

Good to hear!

I suggest, when that time comes, to post a request to the YahooGroup
"CSS-Style". There's a number of CSS experts there who may
consider helping PG/DP out with alternative CSS style sheets. One can't
have too many different ways to view the same content. (This is the next
thing I plan with the "My Antonia" proof-of-concept project -- to ask
for alternative style sheets for the document. Let the end-user try
different layouts and choose the one they prefer.)

Jon Noring

From hyphen at hyphenologist.co.uk  Wed Apr 27 00:34:32 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Apr 27 00:35:05 2005
Subject: [gutvol-d] Dict of National Biography
In-Reply-To: <002b01c54abe$680496a0$269495ce@gw98>
References: <002b01c54abe$680496a0$269495ce@gw98>
Message-ID: <2kfu61ltvr766oj51i3l82opc93hr6pnpu@4ax.com>

On Tue, 26 Apr 2005 20:16:39 -0400,  "N Wolcott" <nwolcott@dsdial.net>
wrote:

| The other day I went to the local library to consult their copy of the Dictionary of National Biography, 22 Vol 1921. It was no longer on the shelf. 

In the UK the Libraries have a system to ensure that *some* paper copies of
books are preserved and available at least for reference.   The vast
majority are also available to borrow via the British Library, you just
have to fill in a form, and wait.   Local poets and writers can *always* be
found in the Local Studies section of libraries, often borrowing copies.

-- 
Dave Fawthrop <dave hyphenologist co uk> 
Killfile and Anti Troll FAQs at 
http://www.hyphenologist.co.uk/killfile. 

From collin at xs4all.nl  Wed Apr 27 02:28:22 2005
From: collin at xs4all.nl (Branko Collin)
Date: Wed Apr 27 02:16:44 2005
Subject: [gutvol-d] Dict of National Biography
In-Reply-To: <002b01c54abe$680496a0$269495ce@gw98>
Message-ID: <426F7756.28676.3855FC@localhost>

On 26 Apr 2005, at 20:16, N Wolcott wrote:

> The other day I went to the local library to consult their copy of the
> Dictionary of National Biography, 22 Vol 1921. It was no longer on the
> shelf. Upon enquiry, I was informed that it was library policy to
> surplus all material which might be replaced by an electronic
> resource: in this case the Gale Biographical Database. This database
> gives old entries from Who's who. Not at all the same. It was given to
> their surplus book sale. I note it costs $3000 on Abe. And of course
> when the books are gone there are no more to replace them. (well there
> may have been a reprint or two). Also the "books were old and out of
> date" were "not in use" . The reference books in use were "All the
> plots of Shakespeare's Plays", "Condensed 100  novels with plot lines"
> etc. I suggested they were encouraging theft as a mode of
> preservation. They were not amused. Disposal was up to the local
> librarian, in this case someone whom I believe probably thought there
> was no need for English History anyway. 

Is this a question, a rant, or an exercise in writing long paragraphs?

-- 
branko collin
collin@xs4all.nl
From marcello at perathoner.de  Wed Apr 27 04:58:24 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Apr 27 04:58:48 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <Pine.LNX.4.62.0504262254260.9174@angst.gnu-designs.com>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>	<Pine.LNX.4.62.0504261757450.7291@angst.gnu-designs.com>	<426EC12F.3020607@perathoner.de>
	<Pine.LNX.4.62.0504262254260.9174@angst.gnu-designs.com>
Message-ID: <426F7E60.90704@perathoner.de>

David A. Desrosiers wrote:

>>If connections are your concern why dont you use keep-alive 
>>connections on your site? Modern UAs and webservers can download an 
>>HTML page with CSS and all images in one connection:
> 
> 	Because KeepAlive hurts performance on heavily-loaded servers, 
> and because there are lots of exploits running about specifically used 
> to tie up webservers that use KeepAlive by leaving each socket in a 
> TIME_WAIT state.

Why write such a complicated exploit when just opening the connection 
and sending nothing is much simpler? This will leave the connection in 
an ESTABLISHED state, but will tie up one apache child all the same 
(until TimeOut).

And it will need much less bandwidth than your exploit: The default 
value of TimeOut is 300 while the default value of KeepAliveTimeout is 15.


> 	With Apache Benchmark pounding various pages on the same 
> physical box, with KeepAlive enabled, performance drops by about 80% 
> (we're behind Squid as an HTTP accelerator anyway, so KeepAlive on 
> the Apache side is moot).

But how is your exploit going to work if you have a squid in front? 
Doesn't the squid close the connection?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From shimmin at uiuc.edu  Wed Apr 27 05:56:59 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Wed Apr 27 05:57:02 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <426EC799.4070504@hutchinson.net>
References: <20050426210406.16234EE13D@ws6-1.us4.outblaze.com>	<1429845873.20050426161506@noring.name>
	<426EC799.4070504@hutchinson.net>
Message-ID: <426F8C1B.2050902@uiuc.edu>

> The "standardization" is there ... it just doesn't go as far as to 
> specify a standard style sheet.
> 
> Now, the TEI has a "working" standard style sheet, but there have 
> already been some changes identified in testing.
> 
> Once, we have the final transforms worked out, I plan on having an open 
> call for style sheets on DP.

I've worked on enough 'quirky' projects to think this a bad idea.  The 
typesetters of the past did things that often make it a complete 
judgment call where the content stops and the presentation begins, and 
many projects will not fit on someone else's Procrustean notions of how 
a PG project 'should' be styled.

-- RS
From joshua at hutchinson.net  Wed Apr 27 06:08:56 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Wed Apr 27 06:08:59 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
Message-ID: <20050427130856.85888109967@ws6-4.us4.outblaze.com>


----- Original Message -----
From: "Robert Shimmin" <shimmin@uiuc.edu>
> 
> > The "standardization" is there ... it just doesn't go as far as to specify a 
> > standard style sheet.
> >
> > Now, the TEI has a "working" standard style sheet, but there have already 
> > been some changes identified in testing.
> >
> > Once, we have the final transforms worked out, I plan on having an open call 
> > for style sheets on DP.
> 
> I've worked on enough 'quirky' projects to think this a bad idea.  The 
> typesetters of the past did things that often make it a complete judgment call 
> where the content stops and the presentation begins, and many projects will 
> not fit on someone else's Procrustean notions of how a PG project 'should' be 
> styled.
> 

From my experience, this is not the case and I've put together some very quirky ones into TEI already.  I've picked one text with over 1,000 footnotes and over 500 sidenotes.  It is complicated, but it renders beautifully.  I've picked another text that uses editorial sidenotes running throughout a Middle English poem.

Both of these were great test cases that helped us identify bugs in the transform.

Now, this isn't to say the layout looks exactly like the original.  It does not in many cases.

A sidenote in the margin is functionally the same as a sidenote that is floated as an inset on the side of the page.  Worrying that the presentation of one matches the original book and one doesn't is irrelevant.  The publisher probably didn't slavishly follow the layout the author wrote his original manuscript in either.

Now, do I agree that we will eventually find something that goes beyond what we've covered in the TEI spec?  You better believe we will.  But when that day comes, TEI can be expanded to cover the situation.  Do I think this will be a common occurrence?  Not a chance.

Josh
From joshua at hutchinson.net  Wed Apr 27 06:13:18 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Wed Apr 27 06:13:20 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
Message-ID: <20050427131318.1E626109989@ws6-4.us4.outblaze.com>

I apologize.  The last paragraph attributed to Robert below is actually mine.

* Bad mail editor.  Bad.  Go to your room. *

Josh

----- Original Message -----
From: "Joshua Hutchinson" <joshua@hutchinson.net>
> 
> 
> ----- Original Message -----
> From: "Robert Shimmin" <shimmin@uiuc.edu>
> >
> > > The "standardization" is there ... it just doesn't go as far as to specify 
> > a > standard style sheet.
> > >
> > > Now, the TEI has a "working" standard style sheet, but there have already 
> > > been some changes identified in testing.
> > >
> > > Once, we have the final transforms worked out, I plan on having an open 
> > call > for style sheets on DP.
> >
> > I've worked on enough 'quirky' projects to think this a bad idea.  The 
> > typesetters of the past did things that often make it a complete judgment 
> > call where the content stops and the presentation begins, and many projects 
> > will not fit on someone else's Procrustean notions of how a PG project 
> > 'should' be styled.
> >
> 
> > From my experience, this is not the case and I've put together some very 
> > quirky ones into TEI already.  I've picked one text with over 1,000 
> > footnotes and over 500 sidenotes.  It is complicated, but it renders 
> > beautifully.  I've picked another text that uses editorial sidenotes running 
> > throughout a Middle English poem.
> 
> Both of these were great test cases that helped us identify bugs in the 
> transform.
> 
> Now, this isn't to say the layout looks exactly like the original.  It does 
> not in many cases.
> 
> A sidenote in the margin is functionally the same as a sidenote that is 
> floated as an inset on the side of the page.  Worrying that the presentation 
> of one matches the original book and one doesn't is irrelevant.  The publisher 
> probably didn't slavishly follow the layout the author wrote his original 
> manuscript in either.
> 
> Now, do I agree that we will eventually find something that goes beyond what 
> we've covered in the TEI spec?  You better believe we will.  But when that day 
> comes, TEI can be expanded to cover the situation.  Do I think this will be a 
> common occurrence?  Not a chance.
> 
> Josh
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From shimmin at uiuc.edu  Wed Apr 27 06:33:10 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Wed Apr 27 06:33:13 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <20050427130856.85888109967@ws6-4.us4.outblaze.com>
References: <20050427130856.85888109967@ws6-4.us4.outblaze.com>
Message-ID: <426F9496.5080808@uiuc.edu>

> Now, do I agree that we will eventually find something that goes beyond
> what we've covered in the TEI spec?  You better believe we will.  But when
> that day comes, TEI can be expanded to cover the situation.  Do I think this
> will be a common occurrence?  Not a chance.

This is just off the top of the stack of projects I would like to put 
through DP, but hold off on because they contain elements I consider 
presently unmanageable.

http://www.ews.uiuc.edu/~shimmin/almanac.PNG

Today, if I could figure out a good way to handle it, I would write some 
styling for it, submit it, and be done with it.  Tomorrow, I will have 
to bug some element of the PG bureaucracy to patch my styling into their 
code, which it may not be compatible with, or may disagree with their 
notions of markup elegance.  A tool that works 90% of the time is a 
wonderful thing.  A rule that works 90% of the time is not.

-- RS
From marcello at perathoner.de  Wed Apr 27 07:28:14 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Apr 27 07:28:21 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
In-Reply-To: <426F9496.5080808@uiuc.edu>
References: <20050427130856.85888109967@ws6-4.us4.outblaze.com>
	<426F9496.5080808@uiuc.edu>
Message-ID: <426FA17E.7080304@perathoner.de>

Robert Shimmin wrote:

> This is just off the top of the stack of projects I would like to put 
> through DP, but hold off on because they contain elements I consider 
> presently unmanageable.
> 
> http://www.ews.uiuc.edu/~shimmin/almanac.PNG

I don't see any problem with this. The text inside the wheel can easily 
be encoded in TEI and the whole thing can be added as illustration.

OTOH if you want to get fancy you can encode the wheel in SVG and embed 
the SVG in TEI. Yes, you _can_ and we _already_ support that. If the 
browser groks SVG, the SVG code will be rendered by the browser, if not 
a pre-rendered image will be displayed instead.

So it seems to me, that SVG/TEI is an even better choice than SVG/XHTML, 
because it saves you the trouble to create the fallback image manually.


Thinking ahead, on the day when all PG TEI files will be stored in an 
XML database your readers will be able to do some fancy queries like: 
"show me all the texts which reference the date: Feb 24". If you have 
encoded the date like this:

   <date value="1555-02-24" rend="italic"><name reg="Matthias, St.">S. 
Mathies</name> day</date>

the user will find your text. In a HTML file it will not find that date 
(not even the Saint).
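
A rough sketch of the kind of query described above, written in Python. It assumes the `value` attribute holds an ISO-style date as in the example; the function name `find_dates` and the wrapping `<p>` element are made up for illustration, and a real XML database would use its own query language instead.

```python
import xml.etree.ElementTree as ET

# The <date> element from the message, wrapped in a <p> so that it parses
# as a stand-alone document (the wrapper is an assumption).
SAMPLE = (
    '<p><date value="1555-02-24" rend="italic">'
    '<name reg="Matthias, St.">S. Mathies</name> day</date></p>'
)


def find_dates(xml_text, month, day):
    """Return the text of every <date> whose value falls on month/day."""
    suffix = "-%02d-%02d" % (month, day)
    root = ET.fromstring(xml_text)
    hits = []
    for date in root.iter("date"):
        if date.get("value", "").endswith(suffix):
            # itertext() also collects the text of nested elements like <name>
            hits.append("".join(date.itertext()))
    return hits


print(find_dates(SAMPLE, 2, 24))
```

The match is made on the normalized `value` attribute, not on the printed spelling, which is why "S. Mathies day" is found even though the text never says "Feb 24".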


> Today, if I could figure out a good way to handle it, I would write some 
> styling for it, submit it, and be done with it.  Tomorrow, I will have 
> to bug some element of the PG bureaucracy to patch my styling into their 
> code, which it may not be compatible with, or may disagree with their 
> notions of markup elegance.

Wrong.

If you want to do your project in TEI, fine. Into the bargain you'll get 
HTML, TXT and PDF output from a single file, and more formats will follow.

If you want to stick to the status quo, also fine. You'll have to do 
HTML and TXT manually and will nearly double your work and the work of 
the maintainers but you'll get a slightly better-looking text (in your 
opinion, that is).

Sadly, people today take more pride in a good-looking text than in an 
error-free and usable one.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From joshua at hutchinson.net  Wed Apr 27 08:23:05 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Wed Apr 27 08:23:09 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
Message-ID: <20050427152305.B96302FA99@ws6-3.us4.outblaze.com>


----- Original Message -----
From: "Robert Shimmin" <shimmin@uiuc.edu>
> 
> > Now, do I agree that we will eventually find something that goes beyond
> > what we've covered in the TEI spec?  You better believe we will.  But when
> > that day comes, TEI can be expanded to cover the situation.  Do I think this
> > will be a common occurrence?  Not a chance.
> 
> This is just off the top of the stack of projects I would like to put through 
> DP, but hold off on because they contain elements I consider presently 
> unmanageable.
> 
> http://www.ews.uiuc.edu/~shimmin/almanac.PNG
> 
> Today, if I could figure out a good way to handle it, I would write some 
> styling for it, submit it, and be done with it.  Tomorrow, I will have to bug 
> some element of the PG bureaucracy to patch my styling into their code, which 
> it may not be compatible with, or may disagree with their notions of markup 
> elegance.  A tool that works 90% of the time is a wonderful thing.  A rule 
> that works 90% of the time is not.
> 

I can think of a few ways to do this.  The easiest (and how I'd probably do it if I was in a hurry) is to just insert the whole figure as an image.

<p rend="center">
<figure url="images/image01.png"><figDesc>A Table for the Sodaies letter and Leapeyear.</figDesc></figure></p>

If I was a little more energetic (but still fairly lazy), I'd insert the circular part of the figure as an image with the text cut out and added as a blurb below it.

<p rend="center">
<figure url="images/image01.png"><figDesc>A Table for the Sodaies letter and Leapeyear.</figDesc></figure></p>

<p rend="center">A Table for the Sodaies letter and Leapeyear. blah blah blah</p>

If I was REALLY energetic, I'd redo the graphic in SVG (a vector graphic image format), embed that within the TEI and be done.  Since SVG is vector based, it has the nice side-effect of scaling very nicely to any output size, whether it is for printing through a PDF, or display on a monitor through HTML.

Really, Robert, this one wouldn't have been a show-stopper for DP even without TEI.

Josh
From nwolcott at dsdial.net  Thu Apr 28 07:21:59 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Thu Apr 28 07:45:46 2005
Subject: [gutvol-d] info on clearances
Message-ID: <006b01c54c00$f1bf8a60$159495ce@gw98>

Is there any central place to see the results of a copyright clearance request or do they just go into a hole somewhere? 
N Wolcott  nwolcott2@post.harvard.edu
From grythumn at gmail.com  Thu Apr 28 08:03:32 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Thu Apr 28 08:03:39 2005
Subject: [gutvol-d] info on clearances
In-Reply-To: <006b01c54c00$f1bf8a60$159495ce@gw98>
References: <006b01c54c00$f1bf8a60$159495ce@gw98>
Message-ID: <15cfa2a50504280803681b16d7@mail.gmail.com>

You can look at the status of any of your clearances at
http://copy.pglaf.org , the same place you submit them. If you wish to
see what others have cleared, David Price's list (
http://www.dprice48.freeserve.co.uk/GutIP.html ) is the only complete
option, AFAIK. There are some lists of clearances from the big content
providers at DP in the forums there.

R C

On 4/28/05, N Wolcott <nwolcott@dsdial.net> wrote:
> Is there any central place to see the results of a copyright clearance
> request or do they just go into a hole somewhere? 
> N Wolcott  nwolcott2@post.harvard.edu
From geoff.horton at gmail.com  Thu Apr 28 08:03:51 2005
From: geoff.horton at gmail.com (Geoff Horton)
Date: Thu Apr 28 08:03:59 2005
Subject: [gutvol-d] info on clearances
In-Reply-To: <006b01c54c00$f1bf8a60$159495ce@gw98>
References: <006b01c54c00$f1bf8a60$159495ce@gw98>
Message-ID: <94e5f59605042808035e45abf9@mail.gmail.com>

> Is there any central place to see the results of a copyright clearance
> request or do they just go into a hole somewhere? 

Yours or someone else's?

Geoff
From lee at novomail.net  Thu Apr 28 08:35:26 2005
From: lee at novomail.net (Lee Passey)
Date: Thu Apr 28 08:39:49 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS (gutvol-d Digest, Vol 9, Issue 23)
In-Reply-To: <20050427001709.324168C8F9@pglaf.org>
References: <20050427001709.324168C8F9@pglaf.org>
Message-ID: <427102BE.5040001@novomail.net>

Marcello Perathoner wrote:

[restatement snipped]

> It is not! (People should really read the specs!) If it were #PCDATA 
> (*Parsed* Character DATA) the validator would find the error because 
> it would parse it. It's because it's CDATA that the validator doesn't 
> parse it and in consequence cannot find the "error". In CDATA < has no 
> special meaning at all.


True. In my original analysis I was looking at the DTD declaration for 
file number 15701 which is XHTML Strict as opposed to file number 15698 
which is indeed HTML 4.01 Transitional. But in this case the distinction 
between #PCDATA and CDATA is inconsequential, because for this bug to 
manifest itself we are dealing with a User Agent which either does not 
support the <style> element, or has implemented that support incorrectly 
(as in the case of Opera 7.11). Thus, my analysis, and your restatement, 
are still correct.

Hopefully, this little quibble will not distract people from the larger 
issue, which is that the file at PG is flawed, and needs to be corrected.


David A. Desrosiers wrote:

>	The balance has to be considered: Is it worth it to incur 
>another socket connection and round trip from the client (more server 
>hits to retrieve resources) or is it better to have a larger 
>byte-per-hit (but less socket connections) per-client?
>  
>

This objection, and Marcello's response, are both based on a faulty 
assumption: that the primary use of HTML files will be online, served up 
by some sort of HTTP server. I would  bet that the vast majority of all 
HTML files offered by Project Gutenberg are downloaded to a local 
computer, and then read while offline. I know that _I_ have never read a 
PG e-text directly from the web server. Fetching multiple files from the 
local file system is simply a non-issue.  And if I want to substitute my 
own styles for those of the original poster, downloading to a local file 
system is required. If you're going to insist that user selection of 
styles is not possible, the question of whether they are internal styles 
or external styles is irrelevant, but if you're going to permit user 
flexibility in selecting styles, the issue of multiple GETs is irrelevant.

This is one of the reasons that it is important that all HTML files be 
offered in ZIP format: because it is likely that a user will be reading 
offline, s/he should be able to fetch all the files associated with a 
work (styles and images) in a single download.
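
As an illustration of the single-download idea, here is a minimal Python sketch that packs a book's directory into one ZIP file. The function name `bundle_book` and the directory layout are assumptions for the example; this is not a description of how PG actually builds its archives.

```python
import os
import zipfile


def bundle_book(src_dir, zip_path):
    """Pack every file under src_dir into one ZIP, keeping relative paths.

    zip_path should live outside src_dir, otherwise the archive would try
    to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(folder, name)
                # Store paths relative to src_dir so that relative links to
                # style sheets and images still resolve after unpacking.
                archive.write(full, os.path.relpath(full, src_dir))
    return zip_path
```

Unpacking the archive gives the reader the HTML file together with its style sheets and images, so nothing further needs to be fetched to read offline.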

From marcello at perathoner.de  Thu Apr 28 11:16:36 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Apr 28 11:16:48 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS (gutvol-d Digest, Vol 9, Issue 23)
In-Reply-To: <427102BE.5040001@novomail.net>
References: <20050427001709.324168C8F9@pglaf.org>
	<427102BE.5040001@novomail.net>
Message-ID: <42712884.7080305@perathoner.de>

Lee Passey wrote:


> Thus, my analysis, and your restatement, 
> are still correct.

First you cited the wrong spec and then you got the semantics of PCDATA 
and CDATA the wrong way round. In the end you came out right, but that 
doesn't make it a correct analysis.


> This objection, and Marcello's response, are both based on a faulty 
> assumption: that the primary use of HTML files will be online, served up 
> by some sort of HTTP server. I would  bet that the vast majority of all 
> HTML files offered by Project Gutenberg are downloaded to a local 
> computer, and then read while offline.

But how does the file get to the local harddrive in the first place? 
Maybe you don't realize that the PG website is serving an average of 
300.000 file requests for a total of 130 GB a day. That is 12 Mbit/s, or 
8 T1 lines under full steam all day long.
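
The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: it assumes 1 GB means 10**9 bytes and uses the nominal T1 rate of 1.544 Mbit/s, neither of which is stated in the message.

```python
# Figures quoted in the message above.
BYTES_PER_DAY = 130e9         # 130 GB, taking 1 GB = 10**9 bytes (assumption)
REQUESTS_PER_DAY = 300000
SECONDS_PER_DAY = 24 * 60 * 60
T1_MBIT_PER_S = 1.544         # nominal T1 line rate (assumption)

mbit_per_s = BYTES_PER_DAY * 8 / SECONDS_PER_DAY / 1e6
t1_lines = mbit_per_s / T1_MBIT_PER_S
kb_per_request = BYTES_PER_DAY / REQUESTS_PER_DAY / 1e3

print("average rate: %.1f Mbit/s" % mbit_per_s)
print("T1 equivalents: %.1f" % t1_lines)
print("average size per request: %.0f kB" % kb_per_request)
```

Under these assumptions the average comes to about 12 Mbit/s, or just under 8 T1 lines, with an average of roughly 433 kB per request.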

In our case it is very important not to constipate the pipes with all 
those packets needed to open and close a connection which carry no 
useful data.


> And if I want to substitute my 
> own styles for those of the original poster, downloading to a local file 
> system is required.

Not at all. Your better browser (Mozilla) will let you define your own 
user stylesheets and even switch between multiple author stylesheets. 
(That's what the C in CSS stands for.)


> If you're going to insist that user selection of 
> styles is not possible, the question of whether they are internal styles 
> or external styles is irrelevant, but if you're going to permit user 
> flexibility in selecting styles, the issue of multiple GETs is irrelevant.

Go, read the specs: you can have multiple stylesheets inside one 
document. Open this file in your better browser (Firefox). Then select: 
View | Page Style


<HEAD>
  <STYLE type="text/css" title="red">
    P { color: red }
  </STYLE>
  <STYLE type="text/css" title="blue">
    P { color: blue }
  </STYLE>
  <STYLE type="text/css" title="green">
    P { color: green }
  </STYLE>
</HEAD>
<BODY>
    <P>Don't show this to any DP Project Manager.
</BODY>
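
For anyone who would rather inspect a file than open it in a browser, the following Python sketch lists the titled style sheets in a document like the one above. The class and function names are invented for the example, and it only reports the `title` attributes; which sheet a given browser treats as the default is up to the browser.

```python
from html.parser import HTMLParser

# The sample document from the message above.
SAMPLE = """<HEAD>
  <STYLE type="text/css" title="red">
    P { color: red }
  </STYLE>
  <STYLE type="text/css" title="blue">
    P { color: blue }
  </STYLE>
  <STYLE type="text/css" title="green">
    P { color: green }
  </STYLE>
</HEAD>
<BODY>
    <P>Don't show this to any DP Project Manager.
</BODY>
"""


class StyleTitles(HTMLParser):
    """Collect the title attribute of every <style> element."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "style":
            title = dict(attrs).get("title")
            if title:
                self.titles.append(title)


def style_titles(html_text):
    parser = StyleTitles()
    parser.feed(html_text)
    parser.close()
    return parser.titles


print(style_titles(SAMPLE))
```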


-- 
Marcello Perathoner
webmaster@gutenberg.org

From jon at noring.name  Thu Apr 28 11:35:07 2005
From: jon at noring.name (Jon Noring)
Date: Thu Apr 28 11:35:16 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS (gutvol-d Digest, Vol 9, Issue 23)
In-Reply-To: <42712884.7080305@perathoner.de>
References: <20050427001709.324168C8F9@pglaf.org>
	<427102BE.5040001@novomail.net> <42712884.7080305@perathoner.de>
Message-ID: <802820895.20050428123507@noring.name>

Marcello wrote:
> Lee Passey wrote:

>> This objection, and Marcello's response, are both based on a faulty
>> assumption: that the primary use of HTML files will be online, served up
>> by some sort of HTTP server. I would  bet that the vast majority of all
>> HTML files offered by Project Gutenberg are downloaded to a local 
>> computer, and then read while offline.

> But how does the file get to the local harddrive in the first place?
> Maybe you don't realize that the PG website is serving an average of
> 300.000 file requests for a total of 130 GB a day. That is 12 Mbit/s, or
> 8 T1 lines under full steam all day long.
>
> In our case it is very important not to constipate the pipes with all
> those packets needed to open and close a connection which carry no 
> useful data.

Lee did mention that the HTML version of a PG book be put into a
downloadable zip file. This should lead to some reduction in "pipe
flow". Otherwise, whenever someone accesses the online version, they
may do so multiple times. Also, zip may help some with improved
compression, and of course encapsulate multiple files into one.

At least that's how I see it. How much savings this gives, if any, I
can't estimate.

On another matter, how Firefox handles multiple <style>'s is way cool.
Is there some mechanism using Firefox to also select one among several
supplied *external* stylesheets?

Jon

From joshua at hutchinson.net  Thu Apr 28 11:43:13 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Apr 28 11:43:21 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML
	andCSS (gutvol-d Di
Message-ID: <20050428184313.750522FAFA@ws6-3.us4.outblaze.com>


----- Original Message -----
From: "Marcello Perathoner" <marcello@perathoner.de>
> 
> <HEAD>
>   <STYLE type="text/css" title="red">
>     P { color: red }
>   </STYLE>
>   <STYLE type="text/css" title="blue">
>     P { color: blue }
>   </STYLE>
>   <STYLE type="text/css" title="green">
>     P { color: green }
>   </STYLE>
> </HEAD>
> <BODY>
>     <P>Don't show this to any DP Project Manager.
> </BODY>
> 

Too late.  We know all about that trick!  :)

The problem is that some browsers, which shall remain nameless (*cough* IE *cough*), go into "quirks" mode when there are multiple style sheets defined.  Things that worked fine with just one style sheet defined now quit working correctly with multiple style sheets defined.  Never mind the fact that IE refuses to let you switch styles on the fly.

Now, IE *shouldn't* switch to quirks mode based on multiple styles, but it seems to anyway (at least in the limited testing I did on the subject).

It is also extremely annoying that IE triggers quirks mode if you include an XML prolog at the beginning. <?xml version="1.0" encoding="utf-8"?>

(BTW, Marcello, the TEI conversion currently puts that prolog in ... which triggers IE to quirks mode DESPITE the Strict statement in the next line.  We should see if we can safely remove it.)
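
One way the prolog could be dropped as a post-processing step is sketched below in Python. This is not the actual TEI conversion code; the function name is hypothetical, and it assumes the output really is UTF-8 (or UTF-16), since an XML parser falls back to those encodings when the declaration is missing. It also assumes the text does not start with a byte order mark.

```python
import re

# Matches an XML declaration at the very start of the text, together with
# any whitespace that follows it.
XML_DECL = re.compile(r"\A<\?xml\b[^>]*\?>\s*")


def strip_xml_declaration(text):
    """Return text without a leading <?xml ...?> declaration, if present."""
    return XML_DECL.sub("", text, count=1)


page = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    "<html></html>\n"
)
print(strip_xml_declaration(page))
```

With the declaration gone, the DOCTYPE becomes the first thing in the file, which is the condition the quirks-mode rule described here depends on.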

See http://www.quirksmode.org/css/quirksmode.html for a quick and dirty primer on quirks mode and why it exists.

Josh
From joshua at hutchinson.net  Thu Apr 28 11:47:33 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Apr 28 11:47:41 2005
Subject: [gutvol-d] More than you ever wanted to know about
	XHTML andCSS (gutv
Message-ID: <20050428184733.933272F9CD@ws6-3.us4.outblaze.com>


----- Original Message -----
From: "Jon Noring" <jon@noring.name>
> 
> On another matter, how Firefox handles multiple <style>'s is way cool.
> Is there some mechanism using Firefox to also select one among several
> supplied *external* stylesheets?
> 

Yes!  You can define a virtually unlimited number of styles, one after another, in the beginning of your HTML file.  Firefox will default to the first one, but you can switch on the fly under the View -> Page Style menu.

Just to make it more frustrating, IE picks the LAST style out of the list to display.  *rolls eyes*

Disclaimer: I may have the first and last backwards.  Firefox may pick the last and IE the first.  I just know they never AGREE on which one to display by default.

Josh
From jon at noring.name  Thu Apr 28 12:16:08 2005
From: jon at noring.name (Jon Noring)
Date: Thu Apr 28 12:16:19 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML
	andCSS (gutvol-d Di
In-Reply-To: <20050428184313.750522FAFA@ws6-3.us4.outblaze.com>
References: <20050428184313.750522FAFA@ws6-3.us4.outblaze.com>
Message-ID: <976451553.20050428131608@noring.name>

Joshua wrote:
> Marcello Perathoner wrote:

>> <HEAD>
>>   <STYLE type="text/css" title="red">
>>     P { color: red }
>>   </STYLE>
>>   <STYLE type="text/css" title="blue">
>>     P { color: blue }
>>   </STYLE>
>>   <STYLE type="text/css" title="green">
>>     P { color: green }
>>   </STYLE>
>> </HEAD>
>> <BODY>
>>     <P>Don't show this to any DP Project Manager.
>> </BODY>

> It is also extremely annoying that IE triggers quirks mode if you
> include an XML prolog at the beginning. <?xml version="1.0"
> encoding="utf-8"?>

Thanks for the link to quirksmode.org. I was aware of the quirks/
strict mode switches of browsers (based mostly on DOCTYPE), but didn't
realize that IE6 messed things up vis-a-vis the XML prolog:

   "In Explorer 6 Windows, Microsoft implemented one extra rule: if a
   doctype that triggers strict mode is preceded by an xml prolog, the
   page shows in quirks mode. This was done to allow web developers to
   achieve valid pages (which require a doctype) but nonetheless stay
   in quirks mode."

This is very lame. Microsoft simply assumed that no one will ever
include the XML prolog in a finished online HTML page, so they invoked
this rule for testing purposes! (There are certainly other ways they
could have used to force quirks mode, like an "always quirks mode"
menu selection.)

Hopefully the announced IE upgrade (supposedly to be more CSS
standards conformant -- a result of the pressure from Firefox and
Opera) will fix this. But I'm not holding my breath.

Jon Noring


From marcello at perathoner.de  Thu Apr 28 12:16:55 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Apr 28 12:17:05 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and
	CSS (gutvol-d Digest, Vol 9, Issue 23)
In-Reply-To: <802820895.20050428123507@noring.name>
References: <20050427001709.324168C8F9@pglaf.org>	<427102BE.5040001@novomail.net>
	<42712884.7080305@perathoner.de>
	<802820895.20050428123507@noring.name>
Message-ID: <427136A7.5090108@perathoner.de>

Jon Noring wrote:

> Lee did mention that the HTML version of a PG book be put into a
> downloadable zip file.

I see. He invented warm water. We have been doing that since day 1.


> On another matter, how Firefox handles multiple <style>'s is way cool.
> Is there some mechanism using Firefox to also select one among several
> supplied *external* stylesheets?

Just the same. Use multiple <link>s and give them different titles.
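
A rough sketch of the external-sheet version of the red/blue/green example, assuming three hypothetical files red.css, blue.css and green.css sit next to the HTML file.  Marking the non-default sheets as "alternate stylesheet" is the form the HTML 4 Rec describes; the file names and titles are only for illustration:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
  <TITLE>Alternate style sheets</TITLE>
  <!-- Applied by default unless the reader picks another style. -->
  <LINK rel="stylesheet" type="text/css" title="red" href="red.css">
  <!-- Offered as choices, e.g. under View -> Page Style in Firefox. -->
  <LINK rel="alternate stylesheet" type="text/css" title="blue" href="blue.css">
  <LINK rel="alternate stylesheet" type="text/css" title="green" href="green.css">
</HEAD>
<BODY>
  <P>Don't show this to any DP Project Manager.
</BODY>
</HTML>
```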


-- 
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com  Thu Apr 28 13:46:49 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Apr 28 13:47:02 2005
Subject: [gutvol-d] re: this, that, and the other thing
Message-ID: <d8.2527948b.2fa2a5b9@aol.com>

josh said:
>   *rolls eyes*
>   Disclaimer: I may have the first and last backwards.  
>   Firefox may pick the last and IE the first.  
>   I just know they never AGREE on which one to display by default.

i  would imagine the spec specifies, but if it doesn't, then
it's not obvious whether to pick the first or the last, is it?

so why roll eyes at one way or the other?

anywho, it doesn't really matter either way, since you can
make the last a duplicate of the first if you have several...

meanwhile, even though the experts have had their conclave,
and discussed this, that, and the other thing, and then retired
to their corners, these .html files _still_ don't work for me...

-bowerbird
From prosfilaes at gmail.com  Thu Apr 28 19:53:59 2005
From: prosfilaes at gmail.com (David Starner)
Date: Thu Apr 28 19:54:14 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML and CSS
	(gutvol-d Digest, Vol 9, Issue 23)
In-Reply-To: <427102BE.5040001@novomail.net>
References: <20050427001709.324168C8F9@pglaf.org>
	<427102BE.5040001@novomail.net>
Message-ID: <6d99d1fd05042819537173ce58@mail.gmail.com>

On 4/28/05, Lee Passey <lee@novomail.net> wrote:
> This objection, and Marcello's response, are both based on a faulty
> assumption: that the primary use of HTML files will be online, served up
> by some sort of HTTP server. I would  bet that the vast majority of all
> HTML files offered by Project Gutenberg are downloaded to a local
> computer, and then read while offline. I know that _I_ have never read a
> PG e-text directly from the web server. 

But you don't know--you can't know--that it's a faulty assumption. You
know how you do it. No one reading it via a library terminal downloads
it, though. I usually read books online.
From jmdyck at ibiblio.org  Thu Apr 28 22:21:57 2005
From: jmdyck at ibiblio.org (Michael Dyck)
Date: Thu Apr 28 22:26:30 2005
Subject: [gutvol-d] More than you ever wanted to know about	XHTMLandCSS
	(gutv
References: <20050428184733.933272F9CD@ws6-3.us4.outblaze.com>
Message-ID: <4271C475.2317626B@ibiblio.org>

Joshua Hutchinson wrote:
> 
> You can define a virtually unlimited number of styles, one
> after another, in the beginning of your HTML file.  Firefox will
> default to the first one, but you can switch on the fly under the
> View -> Page Style menu.
> 
> Just to make it more frustrating, IE picks the LAST style out of
> the list to display.  *rolls eyes*
> 
> Disclaimer: I may have the first and last backwards.  Firefox may
> pick the last and IE the first.  I just know they never AGREE on
> which one to display by default.

The HTML 4 Rec, at
    http://www.w3.org/TR/html4/present/styles.html#h-14.3.1
talks about multiple mutually exclusive style sheets, of which one
can be 'preferred'. (Browsers should apply the preferred style sheet,
if the user hasn't selected a different one.) However, the syntax by
which the author specifies the preferred style sheet appears to only
apply to *external* style sheets, referenced via <LINK> elements.
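
As I read that section, the syntax for <LINK> elements goes roughly like this: rel="stylesheet" with no title makes a sheet persistent (always applied), rel="stylesheet" with a title makes it preferred, and rel="alternate stylesheet" with a title makes it an alternate.  The Rec also lets a META element name the preferred style.  A sketch, with made-up file names and titles, and with the caveat that browser support for the META form may vary:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
  <TITLE>Persistent, preferred and alternate sheets</TITLE>
  <!-- Optional: names the preferred style explicitly.  Per the Rec
       this takes precedence over the choice implied by the LINKs. -->
  <META http-equiv="Default-Style" content="compact">
  <!-- Persistent: no title, applied whichever style is selected. -->
  <LINK rel="stylesheet" type="text/css" href="common.css">
  <!-- Preferred: titled, applied unless the user selects another. -->
  <LINK rel="stylesheet" type="text/css" title="compact" href="compact.css">
  <!-- Alternate: titled, applied only if the user selects it. -->
  <LINK rel="alternate stylesheet" type="text/css" title="big print"
        href="bigprint.css">
</HEAD>
<BODY>
  <P>Sample paragraph.
</BODY>
</HTML>
```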

Marcello Perathoner wrote:
> 
> <HEAD>
>   <STYLE type="text/css" title="red">
>     P { color: red }
>   </STYLE>
>   <STYLE type="text/css" title="blue">
>     P { color: blue }
>   </STYLE>
>   <STYLE type="text/css" title="green">
>     P { color: green }
>   </STYLE>
> </HEAD>
> <BODY>
>     <P>Don't show this to any DP Project Manager.
> </BODY>
> 

For *internal* style sheets, defined within <STYLE> elements as above,
it seems to me that the Rec is silent on the meaning/significance of
attaching a 'title' attribute to a STYLE element. And even more silent
on the significance of multiple such elements. (Or how they should
interact with external style sheets.) As far as I can tell, this also
applies to versions of XHTML up to 2.0.

-Michael Dyck

From ke at gnu.franken.de  Thu Apr 28 22:30:01 2005
From: ke at gnu.franken.de (Karl Eichwalder)
Date: Fri Apr 29 07:39:15 2005
Subject: [gutvol-d] More than you ever wanted to know about XHTML
	andCSS (gutv
In-Reply-To: <20050428184733.933272F9CD@ws6-3.us4.outblaze.com> (Joshua
	Hutchinson's message of "Thu, 28 Apr 2005 13:47:33 -0500")
References: <20050428184733.933272F9CD@ws6-3.us4.outblaze.com>
Message-ID: <shbr7yrvza.fsf@tux.gnu.franken.de>

"Joshua Hutchinson" <joshua@hutchinson.net> writes:

> Disclaimer: I may have the first and last backwards.  Firefox may pick
> the last and IE the first.  I just know they never AGREE on which one to
> display by default.

IIRC, the 'rel' attribute defines the default stylesheet; additional
stylesheets should set it to "alternate stylesheet":

<link rel="stylesheet" title="Compact" href="bell.css" type="text/css">
<link rel="alternate stylesheet" title="With Page Markers and Meta Info"
      href="bell-pb.css" type="text/css">

-- 
http://www.gnu.franken.de/ke/                           |      ,__o
                                                        |    _-\_<,
                                                        |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4  AA7F C90A 35C3 E9D0 5D1C
From lee at novomail.net  Fri Apr 29 12:43:06 2005
From: lee at novomail.net (Lee Passey)
Date: Fri Apr 29 12:44:33 2005
Subject: [gutvol-d] Re: More than you ever wanted to know about XHTML and
 CSS (gutvol-d Digest, Vol 9, Issue 26)
In-Reply-To: <20050428190004.9938B8C92B@pglaf.org>
References: <20050428190004.9938B8C92B@pglaf.org>
Message-ID: <42728E4A.1020203@novomail.net>

Marcello Perathoner wrote:

> In the end you came out right,

Precisely.

Hopefully, this little quibble will not distract people from the larger 
issues, which are that the file at PG is still broken and needs to be 
corrected, and that the W3C's validator does not necessarily guarantee a 
fully conformant file.