From marcello at perathoner.de  Wed Jun  1 11:32:49 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Jun  1 11:32:59 2005
Subject: [gutvol-d] WIPO Online Forum on Intellectual Property in the
	Information Society
Message-ID: <429DFF51.7080007@perathoner.de>

Welcome to the Online Forum on Intellectual Property in the Information 
Society, hosted by the World Intellectual Property Organization (WIPO) 
from June 1 to 15, 2005.
		
The WIPO Online Forum is designed to enable and encourage an open debate 
on issues related to intellectual property in the information society, 
and in light of the goals of the World Summit on the Information Society 
(WSIS).  This presents a unique opportunity for all to engage in the 
emerging debate on intellectual property in our day.

The 10 themes for discussion are listed below - scroll down to select a 
theme.

The WIPO Online Forum is open to participation by all interested persons 
- you are invited to join in online discussions over a period of two 
weeks from June 1, 2005. It is hoped that the Online Forum will further 
inform the discussions taking place during the second phase of WSIS. 
The conclusions of the Online Forum will form part of WIPO's 
contribution to the WSIS Tunis Summit.


   http://www.wipo.int/ipisforum/en/


-- 
Marcello Perathoner
webmaster@gutenberg.org

From servalan at ar.com.au  Thu Jun  2 19:10:27 2005
From: servalan at ar.com.au (Pauline)
Date: Thu Jun  2 19:11:15 2005
Subject: [gutvol-d] DP is back up
In-Reply-To: <429C1CE9.7030206@ar.com.au>
References: <20050531072555.GA20636@pglaf.org> <429C1CE9.7030206@ar.com.au>
Message-ID: <429FBC13.9050503@ar.com.au>

Hi All,

DP is back up now. Come & have a look at our new site.
http://www.pgdp.net

Thanks for your patience,
P
-- 
Help digitise public domain books:
Distributed Proofreaders: http://www.pgdp.net
"Preserving history one page at a time."

Set free dead-tree books:
http://bookcrossing.com/referral/servalan
From kouhia at nic.funet.fi  Mon Jun  6 10:18:24 2005
From: kouhia at nic.funet.fi (Juhana Sadeharju)
Date: Mon Jun  6 10:18:35 2005
Subject: [gutvol-d] Re: WIPO Online Forum on Intellectual Property
Message-ID: <S11194AbVFFRSY/20050606171824Z+3565@nic.funet.fi>

>From: Marcello Perathoner <
>
>Welcome to the Online Forum on Intellectual Property in the Information 
>Society, hosted by the World Intellectual Property Organization (WIPO) 
>from June 1 to 15, 2005.

What is the aim of this project?

I suggest the copyright period be changed so that each book
has a fixed 50 years of copyright protection. Would this suggestion
be seriously considered at WIPO? Or does the Disn... money talk?

How about my suggestion of making anything patentable at no cost,
so that we who develop free software could protect our intellectual
property as well? All disagreements over IP would be settled in the
courts with money. Patent offices would not spend money on examining
the patents. As it stands, the patent system discriminates against
those of us who don't take money for our products.

Whose intellectual property is WIPO after? Who or what companies
are behind WIPO?

Juhana
-- 
  http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
  for developers of open source graphics software
From marcello at perathoner.de  Mon Jun  6 11:01:03 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Jun  6 11:01:13 2005
Subject: [gutvol-d] Re: WIPO Online Forum on Intellectual Property
In-Reply-To: <S11194AbVFFRSY/20050606171824Z+3565@nic.funet.fi>
References: <S11194AbVFFRSY/20050606171824Z+3565@nic.funet.fi>
Message-ID: <42A48F5F.5040708@perathoner.de>

Juhana Sadeharju wrote:
>>From: Marcello Perathoner <
>>
>>Welcome to the Online Forum on Intellectual Property in the Information 
>>Society, hosted by the World Intellectual Property Organization (WIPO) 
>>from June 1 to 15, 2005.
> 
> What is the aim of this project?
> 
> I suggest the copyright period be changed so that each book
> has a fixed 50 years of copyright protection. Would this suggestion
> be seriously considered at WIPO? Or does the Disn... money talk?
> 
> How about my suggestion of making anything patentable at no cost,
> so that we who develop free software could protect our intellectual
> property as well? All disagreements over IP would be settled in the
> courts with money. Patent offices would not spend money on examining
> the patents. As it stands, the patent system discriminates against
> those of us who don't take money for our products.
> 
> Whose intellectual property is WIPO after? Who or what companies
> are behind WIPO?

WIPO stands for World Intellectual Property Organization.

Very basically, it's a United Nations agency that administers
international treaties on patent and copyright issues.

More information at: www.wipo.int


We at PG should comment about the detrimental effects of overly long 
copyrights on culture and education.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From sly at victoria.tc.ca  Mon Jun  6 18:29:02 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Mon Jun  6 18:29:18 2005
Subject: [gutvol-d] Re: WIPO Online Forum on Intellectual Property
In-Reply-To: <S11194AbVFFRSY/20050606171824Z+3565@nic.funet.fi>
References: <S11194AbVFFRSY/20050606171824Z+3565@nic.funet.fi>
Message-ID: <Pine.GSO.4.58.0506061809280.9055@vtn1.victoria.tc.ca>


On Mon, 6 Jun 2005, Juhana Sadeharju wrote:

>
> I suggest the copyright period be changed so that each book
> has a fixed 50 years of copyright protection. Would this suggestion
> be seriously considered at WIPO? Or does the Disn... money talk?
>

I believe you would need to go back in copyright history
a little bit. As far as I know, the terms under WIPO
are based on the Berne Convention.

This convention (first formulated in 1886) is the most
widespread international copyright agreement.

It sets out a basic minimum copyright term of life+50.
The US avoided signing onto this treaty until near the
end of the twentieth century. Unfortunately, the US, along
with a few other countries, has enacted laws which grant a
copyright term longer than the minimum.

I would suggest that at this point in time, attempts to
change the minimum term set out in the Berne Convention
would be useless. If possible, it might be good to
encourage national laws to stay with that minimum--and to
present countries which do so as progressive.

(Some people will argue the opposite--that countries with
a life+50 term are backwards, behind the times, and should
"catch up" with the U.S., the U.K., et al.)

Andrew
From webmaster at gutenberg.org  Wed Jun  8 13:29:07 2005
From: webmaster at gutenberg.org (Marcello Perathoner)
Date: Wed Jun  8 13:29:21 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
Message-ID: <42A75513.1080508@gutenberg.org>

Anybody want to answer this one?


-------- Original Message --------
Subject: Ebook Reading device?
Date: Wed, 08 Jun 2005 21:11:35 +0100
From: Robert Sutherland <robsuth@robsuth.plus.com>
To: webmaster@gutenberg.org


Being now in retirement I lately became interested in E-books and was
delighted - amazed, more like! - to discover Project Gutenberg. However, I
have been very puzzled by the apparent absence of a simple portable device
designed for reading downloaded e.books. All my searches on the internet
and my inquiries of the trade have failed to trace one. I wonder if you can
put me on track of one?

The trade just assume that a lap-top or a PDA would be quite adequate, but
neither is really suitable. I use a lap-top mostly but they are far bigger
than is required, and are far from being as portable as I am sure a
specific device could be. I have not found a PDA with a large enough screen
to provide comfortable reading - indeed, even to take the kind of
line-length used in PG, or if they do it would excessively reduce the print
size, which begins to matter as one gets older. To anyone making any
considerable use of e.books a specific device designed for the purpose
would be a distinct asset.

As far as I see from the internet, there used to be a few such devices
available but they seem to have been dedicated to special file formats used
exclusively by firms producing e.books for sale: the indications seem to be
that their efforts to establish monopolies mostly failed and their devices
ceased to be available in the market. Some at least were exclusive to USA
anyway, which would not have helped someone like myself resident in UK.

I raised this matter with one of the main UK computing magazines but they
came back only with the standard view that a PDA would do, which of course
it would not, being designed for quite different purposes. I have also
enquired of several of the main computing retailers, none of whom has shown
the slightest interest.

I feel quite surprised that nothing specific is available - have I missed
something in my researches? If I have, I'd be very grateful if you could
point me in the right direction; but if I have not, then could PG perhaps
set a spark to some manufacturer's imagination?

I thought that perhaps a modern DVD portable player might be the answer -
some very cheap models are becoming available - but from the specifications
I have seen and the advice given by retailers they are unlikely to be able
to take .txt, .rtf or .pdf files. If they did, one could simply put the
e.books onto CD or DVD as data files - although slightly bigger than
Captain Picard uses when at leisure in his quarters, a portable DVD player
would be much more convenient to use than a laptop. I am currently trying
to ascertain whether it might be possible to charge an existing model with
a program to make it compatible? One just needs .txt, .rtf and .pdf.

Yours sincerely,
Robert Sutherland
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


-- 
Marcello Perathoner
webmaster@gutenberg.org

From joshua at hutchinson.net  Wed Jun  8 13:36:09 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Wed Jun  8 13:36:18 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
Message-ID: <20050608203609.2EC9A9E792@ws6-2.us4.outblaze.com>

Robert,

You might be more pleased with the results on a PDA if you try an HTML edition of the book as opposed to the Text versions.  In my experience, the simplified web browser in most PDAs is quite up to the task of formatting the text to nicely fit on a PDA screen.  It is the hard return marks in the text files that cause the line length issues you've seen.

NOTE: This doesn't help with those texts that don't have an HTML edition available, I realize.  There are others on the list that may be better suited to answer about dedicated eBook readers (which I have heard of, but have no direct experience with).

Josh


From hart at pglaf.org  Wed Jun  8 13:44:24 2005
From: hart at pglaf.org (Michael Hart)
Date: Wed Jun  8 13:44:25 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
In-Reply-To: <20050608203609.2EC9A9E792@ws6-2.us4.outblaze.com>
References: <20050608203609.2EC9A9E792@ws6-2.us4.outblaze.com>
Message-ID: <Pine.LNX.4.60.0506081343300.32186@pglaf.org>


Palmreader and a number of other programs seem to have functions
that can do at least some of what you need, not to mention the
simple stripping of hard returns you can do before loading.
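
As an illustration of the hard-return stripping mentioned above, here is a
minimal Python sketch. It assumes that paragraphs in the plain-text file are
separated by blank lines; the function name unwrap_hard_returns is made up
for this example and is not part of any PG tool.

```python
import re


def unwrap_hard_returns(text: str) -> str:
    """Join hard-wrapped lines so that each paragraph becomes one line.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    # Normalise line endings so CRLF and CR files behave like LF files.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on blank lines (which may contain stray spaces or tabs).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Within a paragraph, collapse every run of whitespace to one space.
    joined = (" ".join(p.split()) for p in paragraphs if p.strip())
    return "\n\n".join(joined)
```

Note that this also flattens verse, tables and other deliberately
line-broken passages, so it suits prose better than poetry.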

mh

On Wed, 8 Jun 2005, Joshua Hutchinson wrote:

> Robert,
>
> You might be more pleased with the results on a PDA if you try an HTML edition of the book as opposed to the Text versions.  In my experience, the simplified web browser in most PDAs is quite up to the task of formatting the text to nicely fit on a PDA screen.  It is the hard return marks in the text files that cause the line length issues you've seen.
>
> NOTE: This doesn't help with those texts that don't have an HTML edition available, I realize.  There are others on the list that may be better suited to answer about dedicated eBook readers (which I have heard of, but have no direct experience with).
>
> Josh
From jon_niehof at yahoo.com  Wed Jun  8 14:10:59 2005
From: jon_niehof at yahoo.com (Jon Niehof)
Date: Wed Jun  8 14:11:09 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
In-Reply-To: <20050608203609.2EC9A9E792@ws6-2.us4.outblaze.com>
Message-ID: <20050608211059.84786.qmail@web32904.mail.mud.yahoo.com>

Joshua Hutchinson wrote:
> You might be more pleased with the results on a PDA if you try
> an HTML edition of the book as opposed to the Text versions. 
> In my experience, the simplified web browser in most PDAs is
> quite up to the task of formatting the text to nicely fit on a
> PDA screen.  It is the hard return marks in the text files
> that cause the line length issues you've seen.

Whereas I take the opposite tack and use the plain text version
coupled with Weasel ( http://gutenpalm.sourceforge.net/ ). It
has an autoscroll mode that fills the screen from top to bottom
and then wraps around to start filling from the top again--so by
the time you reach the bottom of a screenful, the top has new
text. It rewraps the text for you (with a couple of options on
how to do it) so line length's not an issue.

I find reading on a computer much less convenient simply because
there isn't good software. With lighter laptops and especially
tablets the hardware side is less of an issue; the bulkiness of
a tablet per unit screen area probably isn't worse than a PDA or
DVD player.

There are workarounds for using portable DVD players. Most of
them play VCDs and one can make VCDs that are a sequence of
stills. I'm sure similar hacks are possible with DVDs as well,
and it's always possible to create a movie of scrolling text.
But the resolution wouldn't be much better than a modern PDA,
and there'd be a lot of work involved in setting up such a
system.

If you're looking for the most "booklike" solution, a tablet PC
is probably it. A PDA is the most cost-effective approach, which
gives a slightly different "feel" to reading but one that I find
just as enjoyable.

Good luck, and I hope you find something that works for you.


From collin at xs4all.nl  Wed Jun  8 15:33:58 2005
From: collin at xs4all.nl (Branko Collin)
Date: Wed Jun  8 15:20:34 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
In-Reply-To: <42A75513.1080508@gutenberg.org>
Message-ID: <42A78E76.21922.35DE58D@localhost>


On 8 Jun 2005, at 22:29, Marcello Perathoner wrote:

> Being now in retirement I lately became interested in E-books and was
> delighted - amazed, more like! - to discover Project Gutenberg.
> However, I have been very puzzled by the apparent absence of a simple
> portable device designed for reading downloaded e.books. All my
> searches on the internet and my inquiries of the trade have failed to
> trace one. I wonder if you can put me on track of one?

[snip]
 
> I raised this matter with one of the main UK computing magazines but
> they came back only with the standard view that a PDA would do, which
> of course it would not, being designed for quite different purposes. I
> have also enquired of several of the main computing retailers, none of
> whom has shown the slightest interest.
> 
> I feel quite surprised that nothing specific is available - have I
> missed something in my researches? 

I am afraid you haven't missed much.

There are a few devices that have been developed specifically for 
reading ebooks, notably the Sony Librie 
(<http://www.sony.jp/products/Consumer/LIBRIE/>) and the Ebookwise 
1150 (<http://www.ebookwise.com/servlet/mw?t=book&bi=27007&si=43>).

But as you noted: 

> As far as I see from the internet, there used to be a few 
> such devices available but they seem to have been 
> dedicated to special file formats used exclusively by 
> firms producing e.books for sale: the indications seem to 
> be that their efforts to establish monopolies mostly failed
> and their devices ceased to be available in the market. 

However, since you don't mind asking Project Gutenberg, which 
produces very raw and unadorned ebooks, you probably do not mind 
having to put in some extra work. Both the Librie and the Ebookwise 
can handle other formats once you have made a conversion step.

> I thought that perhaps a modern DVD portable player might be the
> answer - some very cheap models are becoming available - but from the
> specifications I have seen and the advice given by retailers they are
> unlikely to be able to take .txt, .rtf or .pdf files. If they did, one
> could simply put the e.books onto CD or DVD as data files - although
> slightly bigger than Captain Picard uses when at leisure in his
> quarters, a portable DVD player would be much more convenient to use
> than a laptop. I am currently trying to ascertain whether it might be
> possible to charge an existing model with a program to make it
> compatible? One just needs .txt, .rtf and .pdf.

A PlayStation Portable may approach what you are looking for; I am 
not sure how well developed interfaces for DVD portables are.

There used to be a small computer somewhere halfway between a PDA and 
a notebook that sounded promising, with wireless ethernet, sub 1-kg 
weight, 7 inch screen (VGA), and 11 hours of battery life. It was 
called the Psion Netbook, and it was pretty much stillborn. But the 
folks at The Register liked it 
(<http://www.theregister.co.uk/2002/09/09/i_have_seen_the_future/>) 
and to me it always sounded like a good ebook reading device. 

Psion followed it up with the Netbook Pro, which is way too heavy. 

If I were you, I would focus on the device first, and only then look 
if there is conversion software available. 

-- 
branko collin
collin@xs4all.nl
From jon_niehof at yahoo.com  Wed Jun  8 15:44:59 2005
From: jon_niehof at yahoo.com (Jon Niehof)
Date: Wed Jun  8 15:45:11 2005
Subject: [gutvol-d] [Fwd: Ebook Reading device?]
In-Reply-To: <42A78E76.21922.35DE58D@localhost>
Message-ID: <20050608224459.60922.qmail@web32910.mail.mud.yahoo.com>

I apologize for hammering your inbox, Robert, but Branko has an
excellent idea:

--- Branko Collin <collin@xs4all.nl> wrote:
> A PlayStation Portable may approach what you are looking for;

It's a bit pricey for *just* an ebook reader (not that a laptop
is cheap), but here are two resources:
http://gamefries.blogspot.com/2005/03/how-to-get-e-books-on-your-psp.html
http://pdf2psp.sourceforge.net/

It's a "batch of images" approach so you can't search or
anything, but it gets the job done. Most of the portable DVD
players listed on Amazon also offer JPEG or Kodak Photo CD
support, and you could probably use some of the same software as
used for the PSP.


> There used to be a small computer somewhere halfway between a
> PDA and a notebook that sounded promising, with wireless
> ethernet, sub 1-kg weight, 7 inch screen (VGA), and 11 hours of
> battery life. It was called the Psion Netbook, and it was
> pretty much stillborn.

The Oqo is similar and is finally "available", at $2600. The
ThinkPad X41 would probably be a worthy competitor for
e-booking--4 lbs., but 12" screen and "only" $1900.

From gbnewby at pglaf.org  Thu Jun  9 00:50:22 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Jun  9 00:50:24 2005
Subject: [gutvol-d] New "draft" DVD image
Message-ID: <20050609075022.GA15457@pglaf.org>

There is a new DVD image some folks might like to check out.
You can get it interactively here:

	http://snowy.arsc.alaska.edu/pgjun05

or download the full ISO here (size=4668391424 bytes,
MD5sum=eb9d00a4b1e4cb30d801709ced6da282):
	ftp://snowy.arsc.alaska.edu/pub/gbn/pgjun05.iso
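
To check a finished download against the size and MD5 sum quoted above,
something like the following Python sketch should do. The helper names and
the local filename are assumptions for illustration; only the two expected
values come from this announcement.

```python
import hashlib
import os

EXPECTED_SIZE = 4668391424  # bytes, as quoted in the announcement
EXPECTED_MD5 = "eb9d00a4b1e4cb30d801709ced6da282"  # as quoted above


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_size, expected_md5):
    """Cheap size check first, then the slower checksum comparison."""
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


# Usage, assuming the ISO was saved as "pgjun05.iso" in the current directory:
#   verify_download("pgjun05.iso", EXPECTED_SIZE, EXPECTED_MD5)
```

Reading in chunks keeps memory use flat, which matters for an image of
this size.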

This is the first major output of Craig Stephenson's
program to allow people to build their own CD/DVD
ISOs.  I'll send a URL to the program in another week
or two (it's not quite ready yet for multiple users).

We started with the Best Of CD titles as core, getting
updated files with an emphasis on HTML.  Then, we blindly
added lots more HTML, uncompressed, for a pleasurable
"unzip-free" reading experience.  I also made sure a few
particular authors were included, in the Best Of tradition.

There are a few things I know are problematic, but please
inform me of any others that you spot:

- a few copyrighted files snuck in (some MP3 audio and a
Kafka)

- the author/title index files are mixed case, and would
be better in a subdirectory

- there might be some Complete volumes that are partially
duplicated by individual volumes.  If you spot any, let
me know

- the author/title index pages need something like a
"Link: " label for the eBook file, and also a "Language: "
field.  We might add a "by-language" index, in addition
to the Author and Title indexes.

Although I made a bunch of these for Michael Hart's visit
to Alaska (public talk=Wednesday June 22 at the Fairbanks
Public Library 7:00 pm), and to try to give away to AK libraries,
I don't expect this to be quite polished enough to redistribute
en masse.  But I hope it might be the core of a new DVD
option to supplement our "PG 10K Special" from December 2004.
(That DVD, which is eBook 11800, is mostly zipped .txt files --
about 9400 titles).

This new DVD image contains about 5100 eBooks.

In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL
database.  Then, PHP is used to provide a user with an iterative,
interactive set of steps to add and delete eBooks and their formats from
the ISO.
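
The catalog-to-database step described above can be sketched roughly as
follows. This is not Craig's program: it swaps in Python and SQLite for PHP
and MySQL, and the XML sample is a simplified, made-up stand-in, since the
real RDF/XML catalog uses its own namespaces and element names. All table,
function and path names here are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog; NOT the real RDF/XML schema.
SAMPLE_CATALOG = """\
<catalog>
  <ebook id="1">
    <title>First Example Book</title>
    <file format="text/html" path="1/1-h.htm"/>
    <file format="text/plain" path="1/1.txt"/>
  </ebook>
  <ebook id="2">
    <title>Second Example Book</title>
    <file format="text/plain" path="2/2.txt"/>
  </ebook>
</catalog>
"""


def load_catalog(xml_text, connection):
    """Parse the catalog and store eBooks and their files in two tables."""
    connection.execute("CREATE TABLE ebooks (id INTEGER PRIMARY KEY, title TEXT)")
    connection.execute("CREATE TABLE files (ebook_id INTEGER, format TEXT, path TEXT)")
    root = ET.fromstring(xml_text)
    for ebook in root.findall("ebook"):
        ebook_id = int(ebook.get("id"))
        connection.execute(
            "INSERT INTO ebooks VALUES (?, ?)", (ebook_id, ebook.findtext("title"))
        )
        for file_el in ebook.findall("file"):
            connection.execute(
                "INSERT INTO files VALUES (?, ?, ?)",
                (ebook_id, file_el.get("format"), file_el.get("path")),
            )
    connection.commit()


def files_for_selection(connection, ebook_ids, wanted_format):
    """Return the paths of the chosen format for the selected eBooks."""
    if not ebook_ids:
        return []
    placeholders = ",".join("?" for _ in ebook_ids)
    rows = connection.execute(
        "SELECT path FROM files WHERE format = ? "
        f"AND ebook_id IN ({placeholders}) ORDER BY path",
        [wanted_format, *ebook_ids],
    )
    return [row[0] for row in rows]
```

Adding or deleting an eBook or a format then amounts to changing the
arguments of the query, which is what makes the selection iterative.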

Building an online browsable prototype of the ISO is simple and fast,
because we use hard links (on the same filesystem as the collection
mirror).  Once it looks good, the actual ISO is built with mkisofs
(which takes a little while) and becomes available for download via FTP
(or HTTP if it's < 2GB).  We'll be adding features and making the
code widely available (though it basically requires a complete PG mirror
to work).
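
The hard-link trick can be pictured with a short Python sketch. The
directory layout and the function name are assumptions for illustration,
not the actual program; the point is only that a hard link creates a second
name for the same file, so no data is copied and the staging tree appears
almost instantly.

```python
import os


def link_selection(mirror_root, staging_root, relative_paths):
    """Hard-link selected files from the mirror into a staging tree.

    Both roots must be on the same filesystem, because hard links
    cannot cross filesystem boundaries.
    """
    linked = []
    for rel in relative_paths:
        source = os.path.join(mirror_root, rel)
        target = os.path.join(staging_root, rel)
        # Recreate the mirror's directory structure under the staging root.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        if not os.path.exists(target):
            os.link(source, target)  # new name for the same data, no copy
        linked.append(target)
    return linked
```

The staging tree can be browsed over the web as it stands and later handed
to mkisofs to produce the real image.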

Enjoy, and please send feedback!
  -- Greg
From marcello at perathoner.de  Thu Jun  9 02:38:11 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jun  9 02:38:33 2005
Subject: [gutvol-d] New "draft" DVD image
In-Reply-To: <20050609075022.GA15457@pglaf.org>
References: <20050609075022.GA15457@pglaf.org>
Message-ID: <42A80E03.6010700@perathoner.de>

Greg Newby wrote:

> In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL
> database.  Then, PHP is used to provide a user with an iterative,
> interactive set of steps to add and delete eBooks and their formats from
> the ISO.

Who do we target, the PG DVD team or the user at large?

Where is this program supposed to run when it is ready?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From kouhia at nic.funet.fi  Thu Jun  9 11:34:45 2005
From: kouhia at nic.funet.fi (Juhana Sadeharju)
Date: Thu Jun  9 11:34:57 2005
Subject: [gutvol-d] Re: WIPO Online Forum on Intellectual Property
Message-ID: <S22750AbVFISep/20050609183445Z+5908@nic.funet.fi>

>From: Andrew Sly <sly@victoria.tc.ca>
>
>This [Berne] convention (first formulated in 1886) is the most
>wide-spread international copyright agreement.
>
>It sets out a basic minimum copyright term of life+50.

Maybe they were wrong then as well.
The term should decrease nowadays. The trend today
is to have the old material available. Nobody gains
if the old books are out-of-print.

But we can also blame the authors who give their
soul... work away for life+70+.

If Berne and its equivalents cannot be changed, then authors
should sign only contracts which do not sell their soul.
Does anyone have statistics on how books sell? For how many years
do books sell at a profit?

Have we asked permission to make out-of-print and still
copyrighted books available? That would save the publisher
the trouble.

Juhana 
-- 
  http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
  for developers of open source graphics software
From gbnewby at pglaf.org  Thu Jun  9 16:04:26 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Jun  9 16:04:29 2005
Subject: [gutvol-d] New "draft" DVD image
In-Reply-To: <42A80E03.6010700@perathoner.de>
References: <20050609075022.GA15457@pglaf.org> <42A80E03.6010700@perathoner.de>
Message-ID: <20050609230426.GE1218@pglaf.org>

On Thu, Jun 09, 2005 at 11:38:11AM +0200, Marcello Perathoner wrote:
> Greg Newby wrote:
> 
> >In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL
> >database.  Then, PHP is used to provide a user with an iterative,
> >interactive set of steps to add and delete eBooks and their formats from
> >the ISO.
> 
> Who do we target, the PG DVD team or the user at large?

The user at large.  But there are benefits for the DVD team and other
purposes, as well.

For example, someone will be able to "save" their ISO configuration,
then return later to get *updated* files for the same eBooks.  This will
be particularly useful for doing things like quarterly updates of
"theme" CDs or DVDs, such as Col Choat's idea of an "explorers"
collection.
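
The two ideas above (loading the catalog into a database, then using a
saved configuration to find files that changed) can be sketched roughly
as follows. This is only an illustration under stated assumptions, not
Craig's program: the catalog below is a made-up, simplified XML stand-in
for the real RDF/XML, SQLite stands in for MySQL, and the names
load_catalog and updated_since are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up, simplified catalog for illustration only. The real catalog is
# RDF/XML with its own vocabulary, which this sketch does not reproduce.
CATALOG = """<catalog>
  <etext id="1" title="Example Book A">
    <file format="txt" url="books/1/1.txt" modified="2005-05-01"/>
    <file format="html" url="books/1/1-h.zip" modified="2005-06-01"/>
  </etext>
  <etext id="2" title="Example Book B">
    <file format="txt" url="books/2/2.txt" modified="2004-12-20"/>
  </etext>
</catalog>"""


def load_catalog(xml_text, conn):
    """Parse the catalog XML and store one row per downloadable file."""
    conn.execute(
        "CREATE TABLE files (etext_id INTEGER, title TEXT,"
        " format TEXT, url TEXT, modified TEXT)"
    )
    root = ET.fromstring(xml_text)
    for etext in root.findall("etext"):
        for f in etext.findall("file"):
            conn.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?, ?)",
                (int(etext.get("id")), etext.get("title"),
                 f.get("format"), f.get("url"), f.get("modified")),
            )
    conn.commit()


def updated_since(conn, saved_selection, last_build):
    """Return URLs of the saved (etext_id, format) pairs changed after last_build."""
    urls = []
    for etext_id, fmt in saved_selection:
        rows = conn.execute(
            "SELECT url FROM files"
            " WHERE etext_id = ? AND format = ? AND modified > ?",
            (etext_id, fmt, last_build),
        ).fetchall()
        urls.extend(row[0] for row in rows)
    return urls


conn = sqlite3.connect(":memory:")
load_catalog(CATALOG, conn)
saved = [(1, "txt"), (1, "html"), (2, "txt")]  # a user's saved ISO configuration
print(updated_since(conn, saved, "2005-05-15"))
```

Dates are kept as ISO strings so that plain string comparison orders them
correctly; a quarterly "theme" update would then only need the files this
query returns.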

> Where is this program supposed to run when it is ready?

On a beefy server.  Right now it's on snowy.arsc.alaska.edu, and I
imagine snowy will be suitable for relatively large-scale use.  I hope
the program will be available at other mirror sites, too.  I think it
will be too intensive in disk & CPU for iBiblio, but you never know...
if this sounds to you computationally unrealistic to offer to the general
reader, read my work .sig below :-)
  -- Greg


Dr. Gregory B. Newby, Chief Scientist, Arctic Region Supercomputing Center
Univ of Alaska Fairbanks-909 Koyukuk Dr-PO Box 756020-Fairbanks-AK 99775-6020
e: newby AT arsc.edu v: 907-450-8663 f: 907-450-8601 w: www.arsc.edu/~newby

From marcello at perathoner.de  Fri Jun 10 02:54:53 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Jun 10 02:55:21 2005
Subject: [gutvol-d] New "draft" DVD image
In-Reply-To: <20050609230426.GE1218@pglaf.org>
References: <20050609075022.GA15457@pglaf.org> <42A80E03.6010700@perathoner.de>
	<20050609230426.GE1218@pglaf.org>
Message-ID: <42A9636D.5030301@perathoner.de>

Greg Newby wrote:

> For example, someone will be able to "save" their ISO configuration,
> then return later to get *updated* files for the same eBooks.  This will
> be particularly useful for doing things like quarterly updates of
> "theme" CDs or DVDs, such as Col Choat's idea of an "explorers"
> collection.

This will be great for the DVD team. I don't know about the users at 
large though.

Some people (not mirrors!) are roboting our whole site once a week in 
search of new books. I wonder how the DVD maker will scale under 
similar load conditions.

I was just wondering if it wasn't more realistic to use jigdo on the 
user's side. People who burn DVDs do have a little knowledge so they 
could manage to install that.

Jigdo advantages: no big single chunk file transfers. jigdo will get the 
ebook files from the ftp server and build the DVD image on the user's PC. 
On updates the user has to transfer just the changed files, not the whole 
DVD image. By building our own jigdo files we could round robin the ftp 
load to different mirrors.

Jigdo disadvantages: user has to install the jigdo client. We have to 
somehow build a jigdo control file (but jigdo is open source, so we can 
figure that out.)
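
The two advantages listed above (transfer only the changed files, and
spread the load over mirrors) can be sketched like this. It is not
jigdo's actual file format or client logic, just the underlying idea; the
manifest layout, the mirror host names, and the function names are all
assumptions made for the example.

```python
import hashlib
from itertools import cycle


def checksum(data: bytes) -> str:
    """Content fingerprint used to tell whether a file changed."""
    return hashlib.sha256(data).hexdigest()


def files_to_fetch(new_manifest, local_cache):
    """Paths whose checksum is missing from, or differs in, the local cache."""
    return sorted(
        path for path, digest in new_manifest.items()
        if local_cache.get(path) != digest
    )


def assign_mirrors(paths, mirrors):
    """Hand out paths to mirrors in round-robin order."""
    return [(path, mirror) for path, mirror in zip(paths, cycle(mirrors))]


# Hypothetical data: what the user already has vs. the updated image contents.
local_cache = {
    "1/1.txt": checksum(b"old text of book 1"),
    "2/2.txt": checksum(b"text of book 2"),
}
new_manifest = {
    "1/1.txt": checksum(b"corrected text of book 1"),  # changed
    "2/2.txt": checksum(b"text of book 2"),            # unchanged
    "3/3.txt": checksum(b"text of book 3"),            # new
}
mirrors = ["ftp://mirror-a.example", "ftp://mirror-b.example"]

for path, mirror in assign_mirrors(files_to_fetch(new_manifest, local_cache), mirrors):
    print(mirror + "/" + path)
```

Only the changed and the new file are scheduled for download; the
unchanged one is reused from the user's disk when the image is rebuilt.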


>>Where is this program supposed to run when it is ready?
> 
> On a beefy server.  Right now it's on snowy.arsc.alaska.edu, and I
> imagine snowy will be suitable for relatively large-scale use.  I hope
> the program will be available at other mirror sites, too.  I think it
> will be too intensive in disk & CPU for iBiblio, but you never know...
> if this sounds computationally unrealistic to offer to the general
> reader, to you, read my work .sig below :-)

Of course, if you throw a NetApp terabyte server at the problem...

You'll need a place to store all those custom DVD images until the user 
has retrieved them. (How to detect that? You can't rely on the user 
notifying you.) Retrieving DVD images has been a PITA even with fast DSL 
modems, so you'll have to save the images for at least a couple of days.


> Dr. Gregory B. Newby, Chief Scientist, Arctic Region Supercomputing Center

That's a good idea. You will save big on your air-conditioning bill. :-)


-- 
Marcello Perathoner
webmaster@gutenberg.org

From gbnewby at pglaf.org  Fri Jun 10 09:26:42 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Fri Jun 10 09:26:44 2005
Subject: [gutvol-d] New "draft" DVD image
In-Reply-To: <42A9636D.5030301@perathoner.de>
References: <20050609075022.GA15457@pglaf.org> <42A80E03.6010700@perathoner.de>
	<20050609230426.GE1218@pglaf.org> <42A9636D.5030301@perathoner.de>
Message-ID: <20050610162642.GB27558@pglaf.org>

On Fri, Jun 10, 2005 at 11:54:53AM +0200, Marcello Perathoner wrote:
> Greg Newby wrote:
> 
> >For example, someone will be able to "save" their ISO configuration,
> >then return later to get *updated* files for the same eBooks.  This will
> >be particularly useful for doing things like quarterly updates of
> >"theme" CDs or DVDs, such as Col Choat's idea of an "explorers"
> >collection.
> 
> This will be great for the DVD team. I don't know about the users at 
> large though.
> 
> Some people (not mirrors!) are roboting our whole site once a week in 
> search for new books. I wonder how the DVD maker will scale under 
> similar load conditions.

We will see, but I don't think the DVD maker will be
robot-able at all.  There are also provisions for 
load balancing....for example, when a user has the CD/DVD
contents specified and says, "make me the ISO file,"
the ISO happens on an "as-available" basis, and the user
gets email when it's ready.
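
A rough sketch of the "as-available" behaviour described above: requests
wait in a queue, a limited number are built per pass, and the requester
is notified afterwards. The class and callback names are hypothetical,
and the build and notify steps are stand-ins rather than the DVD maker's
real code.

```python
from collections import deque


class IsoBuildQueue:
    """First-come, first-served queue of ISO build requests."""

    def __init__(self):
        self._pending = deque()

    def submit(self, email, selection):
        """Queue a request and return its position in line (1 = next)."""
        self._pending.append((email, tuple(selection)))
        return len(self._pending)

    def run_when_idle(self, build, notify, max_jobs=1):
        """Build at most max_jobs images, notifying each requester by email."""
        done = 0
        while self._pending and done < max_jobs:
            email, selection = self._pending.popleft()
            image_name = build(selection)
            notify(email, image_name)
            done += 1
        return done


def fake_build(selection):
    # Stand-in for the expensive step of assembling the ISO file.
    return "custom-%d-ebooks.iso" % len(selection)


def fake_notify(email, image_name):
    # Stand-in for sending the "your ISO is ready" message.
    print("mail to %s: %s is ready" % (email, image_name))


queue = IsoBuildQueue()
queue.submit("reader@example.org", [1, 2, 3])
queue.submit("teacher@example.org", [4, 5])
queue.run_when_idle(fake_build, fake_notify, max_jobs=1)
```

Calling run_when_idle only when the server has spare capacity is what
keeps a burst of requests from turning into a burst of load.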

It's not going to be a viable tool for resource discovery.

> I was just wondering if it wasn't more realistic to use jigdo on the 
> users side. People who burn DVDs do have a little knowledge so they 
> could manage to install that.
> 
> Jigdo advantages: no big single chunk file transfers. jigdo will get the 
> ebook files from the ftp server and build the DVD image on the users PC. 
> On updates the user has to transfer just the changed files not the whole 
> DVD image. By building our own jigdo files we could round robin the ftp 
> load to different mirrors.
> 
> Jigdo disadvantages: user has to install the jigdo client. We have to 
> somehow build a jigdo control file (but jigdo is open source, so we can 
> figure that out.)

I'm 100% in favor of jigdo, and can set you up on snowy if
you (or someone else) would like to get it configured.
  -- Greg

> >>Where is this program supposed to run when it is ready?
> >
> >On a beefy server.  Right now it's on snowy.arsc.alaska.edu, and I
> >imagine snowy will be suitable for relatively large-scale use.  I hope
> >the program will be available at other mirror sites, too.  I think it
> >will be too intensive in disk & CPU for iBiblio, but you never know...
> >if this sounds computationally unrealistic to offer to the general
> >reader, to you, read my work .sig below :-)
> 
> Of course, if you throw a NetApp terabyte server at the problem...
> 
> You'll need a place to store all those custom DVD images until the user 
> has retrieved them. (How to detect that? You can't rely on the user 
> notifying you.) Retrieving DVD images has been a PITA even with fast DSL 
> modems, so you'll have to save the images for at least a couple of days.
> 
> 
> >Dr. Gregory B. Newby, Chief Scientist, Arctic Region Supercomputing Center
> 
> That's a good idea. You will save big on your air-conditioning bill. :-)
> 
> 
> 
> -- 
> Marcello Perathoner
> webmaster@gutenberg.org
> 
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
From gbnewby at pglaf.org  Sun Jun 12 17:09:22 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Jun 12 17:09:24 2005
Subject: [gutvol-d] WIPO Online Forum on Intellectual Property in the
	Information Society
In-Reply-To: <429DFF51.7080007@perathoner.de>
References: <429DFF51.7080007@perathoner.de>
Message-ID: <20050613000922.GA25595@pglaf.org>

Just a few more days to enter a comment.  My comment is below.

On Wed, Jun 01, 2005 at 08:32:49PM +0200, Marcello Perathoner wrote:
> Welcome to the Online Forum on Intellectual Property in the Information 
> Society, hosted by the World Intellectual Property Organization (WIPO) 
> from June 1 to 15, 2005.
> 		
> The WIPO Online Forum is designed to enable and encourage an open debate 
> on issues related to intellectual property in the information society, 
> and in light of the goals of the World Summit on the Information Society 
> (WSIS).  This presents a unique opportunity for all to engage in the 
> emerging debate on intellectual property in our day.
> 
> The 10 themes for discussion are listed below - scroll down to select a 
> theme.
> 
> The WIPO Online Forum is open to participation by all interested persons 
> - you are invited to join in online discussions over a period of two 
> weeks from June 1, 2005. It is hoped that the Online Forum will further 
> inform the discussions taking place during the second phase of WSIS. 
> The conclusions of the Online Forum will form part of WIPO's 
> contribution to the WSIS Tunis Summit.
> 

I posted here, in their "Public Domain" topic:
http://www.wipo.int/roller/comments/ipisforum/Weblog/theme_three_the_public_domain

What I posted:

I agree with many of the earlier comments that question the motivations
of WIPO's raising these questions.  Certainly the past history of WIPO's
role in copyright has shown their interests to be aligned with moneyed
interests.

Nevertheless, I offer a few comments.  As the URL suggests, I'm
affiliated with Project Gutenberg, which is an all-electronic library of
digitized works.  The vast majority of our 16,000+ titles are in the
public domain in the US.  We constantly strive to expand the
accessibility of public domain eBooks by seeking older literary works.
We also seek to identify public domain items that might not, at first
glance, appear to be public domain. These might include:

- items published from 1923-1964 in the US which did not have their
copyright renewed: these are public domain.

- items that are no longer commercially available or for which a
copyright owner cannot be identified.  Under the US Title 17 section
108(h), these may be public domain, and the US Librarian of Congress
seems interested in making them accessible.

- items published prior to 1989 without a copyright notice in the US:
these are public domain.

We believe there are more than adequate protections for copyright owners
to benefit from their works.  Unfortunately, copyright term extensions,
combined with unduly harsh penalties for copyright-related infringement
(especially in the US under the DMCA), have pushed the balance so that
the public domain is deemphasized.  Prior to 1998, one year's worth of
copyrighted items (from 1923, in that case) would enter the public
domain, even as the current year's items started their multi-year
journey under copyright protection.

But thanks to the copyright term extension of 1998, the most astounding
growth in the quantity of information in the world -- fueled by the
Internet -- has not been accompanied by any significant growth in the
public domain.

As others have pointed out in this topic, open source software and
creative commons licenses are welcome, but no substitute for the public
domain.  Such items still have the full force and duration of copyright
law.

Simply put, a healthy public domain is a prerequisite for support of the
creative arts.  It is very much possible to provide for ongoing
commercial potential for some works, while maintaining growth in the
public domain.  This can be accomplished in many ways, but the most
straightforward is to return to the need for active renewal of
copyrights beyond a modest term.  Such procedures would give the very
long copyright terms desired by moneyed interests, while the vast
majority of copyrighted items without such interests would enter the
public domain after a limited term.

WIPO's leadership role should include fostering a growing public domain.


  -- Greg
From jefferydouglaswaddell at gmail.com  Mon Jun 13 16:20:10 2005
From: jefferydouglaswaddell at gmail.com (Jeff Waddell)
Date: Mon Jun 13 16:20:24 2005
Subject: [gutvol-d] Greetings ebook makers ;)
In-Reply-To: <8a44f71c050613161325117bf3@mail.gmail.com>
References: <8a44f71c05061316037c592bbb@mail.gmail.com>
	<8a44f71c050613161325117bf3@mail.gmail.com>
Message-ID: <8a44f71c0506131620d308ebd@mail.gmail.com>

Hello fellow ebook creators,

Some of you may know me and many perhaps do not. Many long years ago I 
started a project called Kids Games which has spawned many newer projects 
and coordinated with others. At the moment that project is basically 
defunct due to many factors. However, I intend to use 
the gutenberg project works with open source software to produce more 
educational software both for individuals and specifically for schools. I 
appreciate all the work that various members of this list have done in the 
past and look forward to what shall be created in the future. I recently 
rewrote my resume to reflect more deeply who I am and what I am about 
regarding my career. Because I feel that I will be utilizing the gutenberg 
project as one of the many resources to reach some of the goals that my 
career implies, I offer that document to all of you (please forward/share 
with anyone you deem it pertinent). You can access it at my personal website 
(http://www.spunge.org/~jwaddell) just by 
following the links. It is there in 5 different formats and I hope one of 
them will work for you. Thank you for your time and I look forward to 
continuing this adventure of creating open source educational software for 
children.

Sincerely,

JefferyDouglasWaddell@gmail.com
From grendelkhan at gmail.com  Tue Jun 14 14:28:19 2005
From: grendelkhan at gmail.com (grendelkhan)
Date: Tue Jun 14 14:28:31 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg texts.
Message-ID: <26b51c32050614142824831367@mail.gmail.com>

I was having a discussion with my father, and I thought I would bring
it up on the mailing list, as it seems to be the place for it.

We'd just come out of our local Wal-Mart, and I'd noticed the
out-of-copyright books (classics and such) being sold for $6 to $11
each. I commented that folks could just download the books for free if
they wanted to read them, but he asked how many people owned a
computer, and how many of those had heard of Project Gutenberg?

So I did a bit of researching, and discovered that there exist "print
on demand" publishers, which instead of doing the offset-printing runs
of thousands and thousands of books, will, once a book has been
prepared and typeset, sometimes keep none at all in stock, and print
them only when ordered.

It seems that it would be a good idea to come up with some way to
offer the majority of PG's catalog through some method of
print-on-demand publishing, selling at-cost. Many Gutenberg works are
obscure, and not of general enough interest to warrant a print run
from a traditional publisher.

I'm aware that I could clearly run off and do this myself, but (a) I
wanted to get some feedback from the community at large, and (b)
print-on-demand publishing still requires start-up costs, and a
per-book "setup" fee of some kind, above and beyond the per-copy
materials cost. Given that PG has the Distributed Proofreaders to
provide lots and lots of work on worthy projects, and given that PGLAF
is a charitable organization which lots of people love, is there some
way to get around that issue?

Would it be worth it to provide a source of dead-tree editions of many
of the archive's works? Thoughts? Objections? Pointers to some guy
who's been doing this for the last ten years that I failed to Google
up?

--grendelkhan
From Bowerbird at aol.com  Tue Jun 14 15:43:44 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Jun 14 15:44:04 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg
	texts.
Message-ID: <128.5ed20c96.2fe0b7a0@aol.com>

grendelkhan said:

>   Would it be worth it to provide 
>   a source of dead-tree editions of 
>   many of the archive's works? 

"worth it" to whom?            :+)


>   Thoughts? 

it's a nice thought, to be sure.
one that many people have had.
but not an easy one to implement.


>   Objections?

it's very smart of you to ask.
so i will tell you some pitfalls.
i tell you this not to dissuade you,
so if you think you can make it work
anyway, just go right ahead and do it.

but know that these are some issues...

the start-up costs are very real,
so you won't get a p.o.d.-printer
to waive them, even for a charity.

and printed in runs of a few copies,
the books still are fairly expensive.

plus the shipping costs will eat you up.

in addition, if you just ran the books off
as they are -- as plain old ascii text --
people would largely turn their noses up.

there is a certain minimum standard that
we have come to expect from "a book", and
failure to meet that is a recipe for failure.

even the .html versions of the books won't
create a p-version that would be acceptable.

so you'd have to invest time/money/energy
in some desktop-publishing capabilities...

even after all that, in today's marketplace,
simply creating a product won't do very much.
today's customers are subjected to such heavy
marketing that they simply won't move at all
unless you bombard 'em with more of the same.

that means you'd have to do hype and marketing,
and probably pay for shelf-space in bookstores.

and by then, you've just become another publisher...

but you'd still lose out to the publishing houses,
because their versions would have slicker covers.

finally, if you ever _did_ make it work, by some
miracle or other, you should then expect to receive
vicious _flak_ from people who will _resent_ you,
because you're "getting rich" off their volunteer labor
and "selling something that should be given away free".

so unless you have a _very_ thick skin...


>   Pointers to some guy who's been doing this 
>   for the last ten years that I failed to Google up?

nobody has been doing it, for the reasons i listed.

that doesn't mean that nobody _will_ be doing it,
however.  if you're really serious about the idea,
see where daniel moynihan is working these days...

at blackmask.com, he demonstrated clearly that
a plain-text master-file can take you a long way.
he hasn't said so directly, but reading in between
the pages, i'm guessing he's going even farther now.          ;+)

-bowerbird
From ian at babcockbrown.com  Tue Jun 14 15:51:36 2005
From: ian at babcockbrown.com (Ian Stoba)
Date: Tue Jun 14 15:50:16 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <26b51c32050614142824831367@mail.gmail.com>
References: <26b51c32050614142824831367@mail.gmail.com>
Message-ID: <618BC402-468D-4B38-A35E-9AA5AAADA704@babcockbrown.com>

I read your message and realized that I just might be the "some guy"  
you were talking about. A while back (probably 10 - 12 years ago) I  
spoke with some professors about using PG texts in their classes. I  
thought that ebooks would be a good alternative to overpriced short  
press runs aimed at impoverished college students. At that time, the  
response I got from everyone I spoke to was that the quality of PG  
texts was just not high enough for academics to endorse them as a  
teaching tool.

After hearing that, I thought very seriously about starting a  
publishing business that would match up young professors (badly in  
need of publishing credits in their hope of becoming tenured) with PG  
texts and bringing out edited, and possibly annotated, editions. I  
suspected that this could be done at a very reasonable cost and would  
be a benefit to the students as well as the professors.

I ended up not pursuing that idea, and it's probably just as well  
that I didn't. A lot has changed in the past decade, notably:

     - Thanks to Distributed Proofreaders the accuracy of PG texts  
has increased -enormously-, likewise the breadth of the PG collection.

     - I learned that annotated editions would likely encumber public  
domain work with newly copyrighted material. This would limit  
students' ability to modify and redistribute the materials.

     - The rise of Creative Commons and Science Commons has given  
academics many new venues to publish outside of the mainstream presses.

     - Print on demand has become a viable business, with press runs  
of one copy now being profitable.

     - Brewster Kahle's bookmobile was just flat out cooler than  
anything I ever imagined, and it works really well.

With all that said, I -still- think it would be great to have good  
quality editions of public domain works available at a reasonable  
cost to students and anyone else who doesn't want to overpay for a  
book. The limitation now in creating a print on demand service for PG  
books is that the main POD publishers tend to want an upfront fee to  
cover their setup and storage costs. In many cases, this may be about  
$500 per title. My guess is that this cost would be prohibitive for  
PG. I do not know if any POD publishers have expressed interest in  
waiving this fee for Project Gutenberg.

Also, some PG volunteers are working on a standard system for  
publishing texts in a markup language called TEI-Lite. As I  
understand it, this markup language (it's a dialect of XML) would  
make it much easier to offer electronic texts in a variety of  
formats, including some that would be suitable for printing and  
binding. This markup would largely replace the academic editor I had  
imagined.
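
As a small illustration of why a markup master helps, the sketch below
turns one marked-up source into both plain text and HTML. The sample is a
heavily simplified fragment that borrows a few TEI element names (head,
p, hi); it is not a valid TEI-Lite document and not the PG volunteers'
system, and every hi element is simply assumed to mean italics.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, namespace-free sample; a real TEI-Lite file has more structure.
SAMPLE = """<text><body><div>
<head>Chapter I</head>
<p>It was a <hi rend="italic">dark</hi> night.</p>
</div></body></text>"""


def to_plain(root):
    """Plain text: headings in capitals, markup inside paragraphs dropped."""
    out = []
    for el in root.iter():
        if el.tag == "head":
            out.append("".join(el.itertext()).upper())
        elif el.tag == "p":
            out.append("".join(el.itertext()))
    return "\n\n".join(out)


def _inline_html(p):
    """Render one paragraph's text, mapping hi elements to <i>."""
    parts = [escape(p.text or "")]
    for child in p:
        inner = escape("".join(child.itertext()))
        if child.tag == "hi":
            inner = "<i>" + inner + "</i>"
        parts.append(inner)
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def to_html(root):
    """HTML: headings become h2, paragraphs keep their emphasis."""
    out = []
    for el in root.iter():
        if el.tag == "head":
            out.append("<h2>" + escape("".join(el.itertext())) + "</h2>")
        elif el.tag == "p":
            out.append("<p>" + _inline_html(el) + "</p>")
    return "\n".join(out)


root = ET.fromstring(SAMPLE)
print(to_plain(root))
print(to_html(root))
```

A print-oriented output (for example a TeX file) would be one more
function of the same shape reading the same source.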

If you decide to go forward with this idea, I would be very  
interested to hear more. I'm not interested in running a publishing  
house at this point in my life, but I would certainly want to order  
some books!

--Ian


On Jun 14, 2005, at 2:28 PM, grendelkhan wrote:

> I was having a discussion with my father, and I thought I would bring
> it up on the mailing list, as it seems to be the place for it.
>
> We'd just come out of our local Wal-Mart, and I'd noticed the
> out-of-copyright books (classics and such) being sold for $6 to $11
> each. I commented that folks could just download the books for free if
> they wanted to read them, but he asked how many people owned a
> computer, and how many of those had heard of Project Gutenberg?
>
> So I did a bit of researching, and discovered that there exist "print
> on demand" publishers, which instead of doing the offset-printing runs
> of thousands and thousands of books, will, once a book has been
> prepared and typeset, sometimes keep none at all in stock, and print
> them only when ordered.
>
> It seems that it would be a good idea to come up with some way to
> offer the majority of PG's catalog through some method of
> print-on-demand publishing, selling at-cost. Many Gutenberg works are
> obscure, and not of general enough interest to warrant a print run
> from a traditional publisher.
>
> I'm aware that I could clearly run off and do this myself, but (a) I
> wanted to get some feedback from the community at large, and (b)
> print-on-demand publishing still requires start-up costs, and a
> per-book "setup" fee of some kind, above and beyond the per-copy
> materials cost. Given that PG has the Distributed Proofreaders to
> provide lots and lots of work on worthy projects, and given that PGLAF
> is a charitable organization which lots of people love, is there some
> way to get around that issue?
>
> Would it be worth it to provide a source of dead-tree editions of many
> of the archive's works? Thoughts? Objections? Pointers to some guy
> who's been doing this for the last ten years that I failed to Google
> up?
>
> --grendelkhan


From cannona at fireantproductions.com  Tue Jun 14 16:44:56 2005
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Tue Jun 14 16:49:04 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of
	Gutenberg texts.
In-Reply-To: <26b51c32050614142824831367@mail.gmail.com>
References: <26b51c32050614142824831367@mail.gmail.com>
Message-ID: <6.2.1.2.0.20050614184411.041802a8@mail.fireantproductions.com>

There was some talk about working with lulu.com, but I'm not sure where 
that ended up.  Good luck.

Sincerely
Aaron Cannon


At 04:28 PM 6/14/2005, you wrote:
>I was having a discussion with my father, and I thought I would bring
>it up on the mailing list, as it seems to be the place for it.
>
>We'd just come out of our local Wal-Mart, and I'd noticed the
>out-of-copyright books (classics and such) being sold for $6 to $11
>each. I commented that folks could just download the books for free if
>they wanted to read them, but he asked how many people owned a
>computer, and how many of those had heard of Project Gutenberg?
>
>So I did a bit of researching, and discovered that there exist "print
>on demand" publishers, which instead of doing the offset-printing runs
>of thousands and thousands of books, will, once a book has been
>prepared and typeset, sometimes keep none at all in stock, and print
>them only when ordered.
>
>It seems that it would be a good idea to come up with some way to
>offer the majority of PG's catalog through some method of
>print-on-demand publishing, selling at-cost. Many Gutenberg works are
>obscure, and not of general enough interest to warrant a print run
>from a traditional publisher.
>
>I'm aware that I could clearly run off and do this myself, but (a) I
>wanted to get some feedback from the community at large, and (b)
>print-on-demand publishing still requires start-up costs, and a
>per-book "setup" fee of some kind, above and beyond the per-copy
>materials cost. Given that PG has the Distributed Proofreaders to
>provide lots and lots of work on worthy projects, and given that PGLAF
>is a charitable organization which lots of people love, is there some
>way to get around that issue?
>
>Would it be worth it to provide a source of dead-tree editions of many
>of the archive's works? Thoughts? Objections? Pointers to some guy
>who's been doing this for the last ten years that I failed to Google
>up?
>
>--grendelkhan


--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) 


From Bowerbird at aol.com  Wed Jun 15 00:44:23 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Jun 15 00:44:45 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
Message-ID: <62.5703acdd.2fe13657@aol.com>


recently i've worked on
"roundtripping" styled
z.m.l. text through a .pdf.

my viewer-program can
write z.m.l. text to a .pdf
such that copying the text
out of the .pdf gives a user
the same text that went in.

make a few global changes --
which restore the whitespace
acrobat usually strips from text 
-- and you can load the text back
into my z.m.l. viewer-program and
generate the same .pdf once again...

the proof is in the pudding, and
the structure is in the presentation.

that is _not_ something you can do 
with text that's copied out of a .pdf
created with other programs i know.

generally, the .pdf format is known
as the "roach motel" of file-formats
-- content goes in and can't get out...       :+)

-bowerbird
From hart at pglaf.org  Wed Jun 15 08:14:09 2005
From: hart at pglaf.org (Michael Hart)
Date: Wed Jun 15 08:14:11 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <26b51c32050614142824831367@mail.gmail.com>
References: <26b51c32050614142824831367@mail.gmail.com>
Message-ID: <Pine.LNX.4.60.0506150809130.17620@pglaf.org>


On Tue, 14 Jun 2005, grendelkhan wrote:

> I was having a discussion with my father, and I thought I would bring
> it up on the mailing list, as it seems to be the place for it.
>
> We'd just come out of our local Wal-Mart, and I'd noticed the
> out-of-copyright books (classics and such) being sold for $6 to $11
> each. I commented that folks could just download the books for free if
> they wanted to read them, but he asked how many people owned a
> computer, and how many of those had heard of Project Gutenberg?

There have been over a billion computers in use in the world for
some time now, and thus well over a billion computer users.

In the US the computer saturation rate is somewhere around ~7/8
of all US households.  [Anyone have the latest figures?]

Not to mention that ~3/4 of these households have hi-speed access.

As to how many of these people know about Project Gutenberg,
that's hard to measure. . .perhaps we should do a survey.

As for the rest of the world, the US is far from being the
most saturated in terms of either computers or access, and
in some lists doesn't even make the top ten. . .for some
reason the Scandinavian countries seemed to beat us there.

More later,

Michael

From distributedmel at gmail.com  Wed Jun 15 08:35:00 2005
From: distributedmel at gmail.com (Melissa Er-Raqabi)
Date: Wed Jun 15 08:35:10 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <Pine.LNX.4.60.0506150809130.17620@pglaf.org>
References: <26b51c32050614142824831367@mail.gmail.com>
	<Pine.LNX.4.60.0506150809130.17620@pglaf.org>
Message-ID: <a5a2a5020506150835141d400@mail.gmail.com>

Michael, where are you getting these numbers? Can you provide some sources 
please? I find them rather incredible. 

Melissa

On 6/15/05, Michael Hart <hart@pglaf.org> wrote:
> 
> 
> 
> On Tue, 14 Jun 2005, grendelkhan wrote:
> 
> > I was having a discussion with my father, and I thought I would bring
> > it up on the mailing list, as it seems to be the place for it.
> >
> > We'd just come out of our local Wal-Mart, and I'd noticed the
> > out-of-copyright books (classics and such) being sold for $6 to $11
> > each. I commented that folks could just download the books for free if
> > they wanted to read them, but he asked how many people owned a
> > computer, and how many of those had heard of Project Gutenberg?
> 
> There have been over a billion computers in use in the world for
> some time now, and thus well over a billion computer users.
> 
> In the US the computer saturation rate is somewhere around ~7/8
> of all US households. [Anyone have the latest figures?]
> 
> Not to mention that ~3/4 of these households have hi-speed access.
> 
> As to how many of these people know about Project Gutenberg,
> that's hard to measure. . .perhaps we should do a survey.
> 
> As for the rest of the world, the US is far from being the
> most saturated in terms of either computers or access, and
> in some lists doesn't even make the top ten. . .for some
> reason the Scandinavian countries seemed to beat us there.
> 
> More later,
> 
> Michael
> 
From grendelkhan at gmail.com  Wed Jun 15 13:22:58 2005
From: grendelkhan at gmail.com (grendelkhan)
Date: Wed Jun 15 13:23:08 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
Message-ID: <26b51c32050615132219efb74e@mail.gmail.com>

Thanks to everyone for their comments so far! I'm learning quite a bit as I go.

As of October 2003, the US Commerce department reported that about
three-fifths of households had a computer; a little over half had
internet access.

https://www.esa.doc.gov/Reports/NationOnlineBroadband04.htm

So it's not as bad as I was led to believe. Still, 

Perhaps I should have stated my goals a little more clearly. I have no
particular interest in making money or making a business out of this.
I'd simply like to make the books available---through whatever means
that may be---in dead-tree form. I suppose it's a terrible idea to tie
the actual Project to a commercial entity by developing a working
relationship with them---I don't think an "Official Project Gutenberg
Edition" is a good idea.

lulu.com, as mentioned, has no setup fees, but their pricing is a mite
stiff---$4.53 plus $0.02/page. Certainly better than buying stuff from
most university presses, but not exactly bargain-basement. Lightning
Source charges (based on some quick googling at
http://com1.runboard.com/bthescribesmessageboard.fwritingarchives.t45%7Coffset=15
), $0.90 plus $0.013 per page, but I don't know what kind of binding
that requires, or what sort of setup fees they charge. Perhaps they'd
waive them if DP put out some sort of print-ready version in addition
to human-readable text. I'm thinking TeX->PDF here, as it's pretty
much the stablest human-readable-yet-fully-marked-up format available.
Thoughts? I suppose I should take a relatively short etext, mark it up
and see how it looks.

I concur that simply throwing plain text, or even decent HTML, at
paper is a horrible idea. So, what I ask is---is there a way to
prepare the etexts as, in addition to HTML, whatever format is
print-ready for these machines? Since typesetting a ready copy is a
simple matter of feeding it to a Xerox DocuTech or whatever the
$100,000 piece of hardware the print shop uses is, how can we do the
necessary preprocessing ourselves? What exactly does the "setup fee"
include?

Thanks to everyone again for being so helpful with this.

--grendelkhan
From hyphen at hyphenologist.co.uk  Wed Jun 15 13:35:08 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Jun 15 13:35:31 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <26b51c32050615132219efb74e@mail.gmail.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
Message-ID: <8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>

On Wed, 15 Jun 2005 16:22:58 -0400,  grendelkhan <grendelkhan@gmail.com>
wrote:

| Thanks to everyone for their comments so far! I'm learning quite a bit as I go.
| 
| As of October 2003, the US Commerce department reported that about
| three-fifths of households had a computer; a little over half had
| internet access.
| 
| https://www.esa.doc.gov/Reports/NationOnlineBroadband04.htm
| 
| So it's not as bad as I was led to believe. Still, 
| 
| Perhaps I should have stated my goals a little more clearly. I have no
| particular interest in making money or making a business out of this.
| I'd simply like to make the books available---through whatever means
| that may be---in dead-tree form. I suppose it's a terrible idea to tie
| the actual Project to a commercial entity by developing a working
| relationship with them---I don't think an "Official Project Gutenberg
| Edition" is a good idea.
| 
| lulu.com, as mentioned, has no setup fees, but their pricing is a mite
| stiff---$4.53 plus $0.02/page. Certainly better than buying stuff from
| most university presses, but not exactly bargain-basement. Lightning
| Source charges (based on some quick googling at
| http://com1.runboard.com/bthescribesmessageboard.fwritingarchives.t45%7Coffset=15
| ), $0.90 plus $0.013 per page, but I don't know what kind of binding
| that requires, or what sort of setup fees they charge. Perhaps they'd
| waive them if DP put out some sort of print-ready version in addition
| to human-readable text. I'm thinking TeX->PDF here, as it's pretty
| much the stablest human-readable-yet-fully-marked-up format available.
| Thoughts? I suppose I should take a relatively short etext, mark it up
| and see how it looks.
| 
| I concur that simply throwing plain text, or even decent HTML, at
| paper is a horrible idea. So, what I ask is---is there a way to
| prepare the etexts as, in addition to HTML, whatever format is
| print-ready for these machines? Since typesetting a ready copy is a
| simple matter of feeding it to a Xerox DocuTech or whatever the
| $100,000 piece of hardware the print shop uses is, how can we do the
| necessary preprocessing ourselves? What exactly does the "setup fee"
| include?

Just a mention that all Europe uses A4 paper.
Anything designed solely for American paper sizes will be useless to
typesetters in Europe.


-- 
Dave Fawthrop <dave hyphenologist co uk> http://www.webshots.com 
Thousands of wonderful professional photos for your Wallpaper and 
Screensaver. also 200,000 amateur pics. Four new pics each day.

From prosfilaes at gmail.com  Wed Jun 15 13:39:26 2005
From: prosfilaes at gmail.com (David Starner)
Date: Wed Jun 15 13:39:37 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
Message-ID: <6d99d1fd050615133945b30ce3@mail.gmail.com>

On 6/15/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
> Just a mention that all Europe uses A4 paper.
> Anything designed solely for American paper sizes will be useless to
> typesetters in Europe.

Why? With decent margins, you can print letter on A4 or vice versa.
And we aren't really talking typesetters here; typesetters don't care
what size paper it is, since they're going to rip it apart and re-set
it anyway. We're talking about people who are dumping our preformed
blob to paper.
From grendelkhan at gmail.com  Wed Jun 15 14:09:47 2005
From: grendelkhan at gmail.com (grendelkhan)
Date: Wed Jun 15 14:09:58 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
Message-ID: <26b51c3205061514095c5e8f8@mail.gmail.com>

On 6/15/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
> Just a mention that all Europe uses A4 paper.
> Anything designed solely for American paper sizes will be useless to
> typesetters in Europe.

I was planning on 6"x9", which I think is the standard trade paperback
size. Except... hmm.

http://www.cafepress.com/cp/info/help/learn_book_info.aspx

Cafe Press states that 4.18in x 6.88in is the standard 'Mass Market
Paperback' size. Also 5in x 8in for 'Standard Paperback'.

http://www.whitehallprinting.com/TrimSize.html

Some random printing company lists 6x9 and 5.5x8.5 as 'Standard Trim Sizes'.

http://www.powerhomebiz.com/vol93/selfpublishing2.htm

Another random tutorial lists 6x9 and 5-3/8x8.

http://www.josephzitt.com/books/smwb-howto.php#pod

Says here that apparently the 6x9 format is standard, at least with
Lightning Source.

Ah, and Cafe Press offers printing for $7 plus $0.03 per page with no
setup fees. So, probably not the cheapest option. Perhaps I'll prep
something and approach Lightning Source asking what they need in the
way of preparation supplies---that is, what can be done for them.
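
A rough sketch of the per-copy arithmetic for the three quotes mentioned
so far in this thread (lulu.com, Lightning Source, Cafe Press). The
300-page book length is an assumption for illustration only, and the
Lightning Source figure leaves out whatever setup fee they charge, since
that is still unknown here.

```python
# Per-copy print-on-demand cost: base charge plus a per-page charge.
# Prices are the ones quoted in this thread (June 2005); the page
# count passed to compare() is a made-up example, not a real book.
PRINTERS = {
    "lulu.com": (4.53, 0.02),
    "Lightning Source": (0.90, 0.013),  # setup fee unknown, not included
    "Cafe Press": (7.00, 0.03),
}


def unit_cost(base, per_page, pages):
    """Cost in dollars of one copy with the given number of pages."""
    return round(base + per_page * pages, 2)


def compare(pages):
    """Map each printer name to its per-copy cost for a book of `pages`."""
    return {
        name: unit_cost(base, per_page, pages)
        for name, (base, per_page) in PRINTERS.items()
    }


if __name__ == "__main__":
    for name, cost in sorted(compare(300).items(), key=lambda kv: kv[1]):
        print(f"{name:18s} ${cost:.2f}")
```

For a 300-page book this works out to $4.80 at Lightning Source, $10.53
at lulu.com and $16.00 at Cafe Press, before any setup fees or shipping.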

Is 6x9 a standard paperback size in Europe? I suppose that's of less
interest. I'll be measuring some of my paperbacks at home this evening
once I get back from work. Maybe print and trim a few test pages or
something.

--grendelkhan
From Bowerbird at aol.com  Wed Jun 15 14:43:44 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Jun 15 14:44:07 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
Message-ID: <15d.52dcf2d7.2fe1fb10@aol.com>

grendelkhan said:
>   Perhaps I should have stated my goals a little more clearly. 
>   I have no particular interest in making money 
>   or making a business out of this.

but it costs money to do it.  so you 
have to "make a business out of it"
in order to do it in the first place...

unless you have a lot of money to throw down the toilet.

and you don't have to worry about "making money",
because unless some big miracle strikes, or you are 
particularly clever about how you go about it, you won't
make any money.  you're far more likely to lose your shirt.
so "avoiding the loss of money" is your real objective here.
if you can't afford to lose any money, you should stay away.


>   I'd simply like to make the books available
>   ---through whatever means that may be---
>   in dead-tree form. 

right.  but you can't just wave a wand and make it happen.


>   lulu.com, as mentioned, has no setup fees, 
>   but their pricing is a mite stiff---$4.53 plus $0.02/page. 

that $4.53 _is_ a setup fee, whether they call it that or not.
and since it's $4.53 _per_ book, it's a rather high one at that.
(if you really want the best p.o.d. price, i'll dig that up for you.
there's one site offering quotes based on several p.o.d. places.)


>   I'm thinking TeX->PDF here

ok, but who puts all the e-texts into tex format?
that's a real cost, a very real cost, and it's huge.


>   Thoughts? I suppose I should take a relatively short etext, 
>   mark it up and see how it looks.

why a "relatively short" one?  that'll just
lead you to underestimate the actual cost,
which is the best way to lose money fast.

mark up one of average size and difficulty,
and then multiply it by about 11,000, and
then you'll have a good idea of the true cost.


>   So, what I ask is---is there a way to

>   prepare the etexts as, in addition to HTML, 
>   whatever format is print-ready for these machines?

there will be, very shortly, yes -- my viewer-program.

given some minor editing of an e-text for consistency,
usually 5-10 minutes for most e-texts, it will format
the book according to the user's specifications (as to
font, size, leading, colors, and paper-size, which dave
mentioned in regard to european users) and create a .pdf.

putting this program into the hands of end-users, so
they can create their own output, to their own specs,
and print it out on their own machines, is one route
to giving them hard-copy versions of all the e-texts.

it's not the only route, but since it puts all the power
and all the _costs_ on their shoulders, it is likely to
be the one that gets implemented more than others...

-bowerbird
From hyphen at hyphenologist.co.uk  Wed Jun 15 23:51:54 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Jun 15 23:52:27 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <6d99d1fd050615133945b30ce3@mail.gmail.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
	<6d99d1fd050615133945b30ce3@mail.gmail.com>
Message-ID: <f782b1dieuk8v2135bs7ka5fggotsdft74@4ax.com>

On Wed, 15 Jun 2005 15:39:26 -0500,  David Starner <prosfilaes@gmail.com>
wrote:

| On 6/15/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
| > Just a mention that all Europe uses A4 paper.
| > Anything designed solely for American paper sizes will be useless to
| > typesetters in Europe.
| 
| Why? With decent margins, you can print letter on A4 or vice versa.
| And we aren't really talking typesetters here; typesetters don't care
| what size paper it is, since they're going to rip it apart and re-set
| it anyway. We're talking about people who are dumping our preformed
| blob to paper.

Please note the Subject of this thread:
Print-on-demand and dead-tree copies of Gutenberg texts.

IMO felling trees is a Bad Idea, especially when with a little thought
fewer trees could be felled.

-- 
Dave Fawthrop <dave hyphenologist co uk> http://www.webshots.com 
Thousands of wonderful professional photos for your Wallpaper and 
Screensaver. also 200,000 amateur pics. Four new pics each day.

From hyphen at hyphenologist.co.uk  Thu Jun 16 00:22:38 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Thu Jun 16 00:23:13 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <26b51c3205061514095c5e8f8@mail.gmail.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
	<26b51c3205061514095c5e8f8@mail.gmail.com>
Message-ID: <gh82b1ha71ejio7q2eu7dclnivl6f4a5cc@4ax.com>

On Wed, 15 Jun 2005 17:09:47 -0400,  grendelkhan <grendelkhan@gmail.com>
wrote:

| On 6/15/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
| > Just a mention that all Europe uses A4 paper.
| > Anything designed solely for American paper sizes will be useless to
| > typesetters in Europe.
| 
| I was planning on 6"x9", which I think is the standard trade paperback
| size. Except... hmm.

But absolutely nobody uses inches any more, at least where I live.
I was using a GPS which gives some measurements in feet, last weekend and
found that I could not envisage how long a foot was.  Even though I spent
*more* than half my life using those insane measurements, ft, ins, lb,
gallons (not US), perch, pole, peck, and so on.   

| http://www.cafepress.com/cp/info/help/learn_book_info.aspx

San Leandro, California 94577

| Cafe Press states that 4.18in x 6.88in is the standard 'Mass Market
| Paperback' size. Also 5in x 8in for 'Standard Paperback'.
| 
| http://www.whitehallprinting.com/TrimSize.html
| 
Naples, FL 34104 USA

| Some random printing company lists 6x9 and 5.5x8.5 as 'Standard Trim Sizes'.
| 
| http://www.powerhomebiz.com/vol93/selfpublishing2.htm

Virginia, USA
 
| Another random tutorial lists 6x9 and 5-3/8x8.
| 
| http://www.josephzitt.com/books/smwb-howto.php#pod

Berkeley, CA 94709

| Says here that apparently the 6x9 format is standard, at least with
| Lightning Source.
| 
| Ah, and Cafe Press offers printing for $7 plus $0.03 per page with no
| setup fees. So, probably not the cheapest option. Perhaps I'll prep
| something and approach Lightning Source asking what they need in the
| way of preparation supplies---that is, what can be done for them.
| 
| Is 6x9 a standard paperback size in Europe? 

No!  A5 usually

| I suppose that's of less
| interest. I'll be measuring some of my  

?American?

| paperbacks at home this evening
| once I get back from work. Maybe print and trim a few test pages or
| something.

Anything but A4 and A3 paper is *impossible* for your ordinary person to
get in Europe.

The jobbing Printers use A0 sheets.
Printing  anything but A sizes produces waste trimmings :-(


http://www.cl.cam.ac.uk/~mgk25/iso-paper.html
>>>
International standard paper sizes

Standard paper sizes like ISO A4 are widely used all over the world today.
This text explains the ISO 216 paper size system and the ideas behind its
design. 

Globalization starts with getting the details right.
Inconsistent use of SI units and international
standard paper sizes remain today a primary
cause for U.S. businesses failing to meet
the expectations of customers worldwide.
<<<
Basically fold/cut A0 in two and you get A1. *with No Waste*  

What I am suggesting is that the design should be A5 for the world, with an
alternative for US use.
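
The halving rule above can be sketched in a few lines. This assumes the
usual ISO 216 starting point of A0 = 841 x 1189 mm and rounding down to
whole millimetres at each cut; the function names are only illustrative.
The 6x9 inch conversion is included to show how the US trade size
compares with A5.

```python
# ISO 216 A-series: cut the sheet in half across its long side to get
# the next size down. Dimensions are in millimetres, rounded down.
A0_MM = (841, 1189)  # assumed standard A0 size (width, height)


def iso_a_sizes(max_n=5):
    """Return {"A0": (w, h), ..., "A<max_n>": (w, h)} in millimetres."""
    width, height = A0_MM
    sizes = {}
    for n in range(max_n + 1):
        sizes[f"A{n}"] = (width, height)
        # The old width becomes the new height; half the old height
        # becomes the new width.
        width, height = height // 2, width
    return sizes


def inches_to_mm(inches):
    """Convert inches to millimetres (1 in = 25.4 mm exactly)."""
    return inches * 25.4


if __name__ == "__main__":
    for name, (w, h) in iso_a_sizes().items():
        print(f"{name}: {w} x {h} mm")
    print(f"6x9 in: {inches_to_mm(6):.1f} x {inches_to_mm(9):.1f} mm")
```

A5 comes out at 148 x 210 mm, while 6x9 inches is about 152 x 229 mm, so
a page designed for one does not drop straight onto the other.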

-- 
Dave Fawthrop <dave hyphenologist co uk> http://www.webshots.com 
Thousands of wonderful professional photos for your Wallpaper and 
Screensaver. also 200,000 amateur pics. Four new pics each day.

From Bowerbird at aol.com  Thu Jun 16 02:27:00 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 16 02:27:28 2005
Subject: [gutvol-d] is that 60 gigs in your pocket?
Message-ID: <b9.5a09ee66.2fe29fe4@aol.com>


it's a photo-ipod in my pocket,
but yes, i am glad to see you...

-bowerbird
From hacker at gnu-designs.com  Thu Jun 16 06:04:50 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jun 16 06:05:31 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <62.5703acdd.2fe13657@aol.com>
References: <62.5703acdd.2fe13657@aol.com>
Message-ID: <Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>


> make a few global changes -- which restores the whitespace acrobat 
> usually strips from text -- and you can load the text back into my 
> z.m.l. viewer-program and generate the same .pdf once again...

	Acrobat doesn't store text in PDFs, they store pixels and 
vectors and OCR'd coordinates. Most definitely not text.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From jonathan.gorman at gmail.com  Thu Jun 16 08:02:19 2005
From: jonathan.gorman at gmail.com (Jon Gorman)
Date: Thu Jun 16 08:02:53 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
Message-ID: <4a6dc7605061608024b004c32@mail.gmail.com>

On 6/16/05, David A. Desrosiers <hacker@gnu-designs.com> wrote:
> 
>         Acrobat doesn't store text in PDFs, they store pixels and
> vectors and OCR'd coordinates. Most definitely not text.

Ummm....so that's why Chapter 5 of the reference is all about text? 
Seriously though, it is  possible to put text into pdfs.  That's why
you can copy and paste out of them.  Granted, there are a lot of
places that just scan in material and post that, but it is not the
only thing that you can do with pdfs.  PDF is derived from postscript
after all.

Unless I missed something in the conversation, in which case I'm
sorry.  Or you're being sarcastic and I just misread ;).  Just didn't
want anyone to be misled.

Jon Gorman
From hacker at gnu-designs.com  Thu Jun 16 08:19:54 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jun 16 08:20:32 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <4a6dc7605061608024b004c32@mail.gmail.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
Message-ID: <Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>


> Seriously though, it is possible to put text into pdfs.  That's why 
> you can copy and paste out of them.  Granted, there are a lot of 
> places that just scan in material and post that, but it is not the 
> only thing that you can do with pdfs.  PDF is derived from 
> postscript after all.

	Just because you can put down a cursor and go from one x,y to 
another x,y does not mean you are "selecting" what is visible on the 
screen, as your human eyes see it.

	PDF is pure layout, no structure. Tables are positioned text 
and lines, columns are positioned text... it's basically OCR, without 
any character detection.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From jonathan.gorman at gmail.com  Thu Jun 16 08:52:50 2005
From: jonathan.gorman at gmail.com (Jon Gorman)
Date: Thu Jun 16 08:53:01 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
Message-ID: <4a6dc760506160852491a30b5@mail.gmail.com>

>         Just because you can put down a cursor and go from one x,y to
> another x,y does not mean you are "selecting" what is visible on the
> screen, as your human eyes see it.

Whoever said they were human eyes ;).  Seriously though, while there
are always encoding issues and the like, given a reasonable
application/clipboard type the region you select should be what is
visible, so I'm not sure what you're suggesting.  Are you just making
the point that when I select the text it's converting an essentially
drawn image into encoded text?  But of course anything displayed
on the monitor or printed out could be argued to be just pixels and/or
vectors.


> 
>         PDF is pure layout, no structure. Tables are positioned text
> and lines, columns are positioned text... it's basically OCR, without
> any character detection.

Sorry, just guess I'm confused.  You said there was no text in pdfs
(implying to me just images).  Chapter 5 of the Reference has a lot of
info of how to include text.  I'm not sure what OCR (Optical Character
Recognition) means if it doesn't do any character detection...

I didn't see any mention of structure anywhere in the email you sent. 
Just that it was impossible to have text. Which is odd since there are
regions of text in a pdf document with instructions on how to draw
that text.  They can be encoded or just inserted when creating the
document.  Granted, the encoded streams are a bit of a pain, but
they're arguably just as much text as any other

I'm not trying to make a mountain of a molehill here.  Just didn't
want some people to get the impression that pdfs were solely
graphic-orientated (like say...jpeg).  Perhaps we have different ideas
of text.

Seriously, no offense to anyone.  Just wanted to clarify things.  I'm
skeptical about bowerbird's claims as well, but it's misleading to say
that Acrobat doesn't store text in the document.  It is possible to
make the text rather obscure, but that doesn't mean that if formatted
correctly you could not scan through the file in a text editor and
read it.  Granted, it's rarely done, but doesn't mean it's impossible.

Jon


> 
> 
> 
> David A. Desrosiers
> desrod@gnu-designs.com
> http://gnu-designs.com
From Bowerbird at aol.com  Thu Jun 16 09:53:18 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 16 09:53:32 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
Message-ID: <1e8.3d51866f.2fe3087e@aol.com>

david said:
>   Acrobat doesn't store text in PDFs, they store pixels and 
>   vectors and OCR'd coordinates. Most definitely not text.

you must be making some kind of semantic argument that
i don't grasp.  because i can copy out my text just peachy.
and most people can copy text out of many .pdfs just fine.

it typically loses a good deal of its formatting, and it is
not unusual for chunks of it to be ordered "out of place",
and the users' ability to copy out text _can_ be disabled,
or subverted in other ways (i.e., by converting text to 
an image format before writing it to the .pdf originally)
but the experience of copying text from a .pdf is common.

however, if you'd like to explain the point you're making,
whether it is semantic or otherwise, do please feel free.       :+)
it probably won't matter much to me, but i don't mind
keeping my brain exercised by doing a little thinking...

-bowerbird
From tim at tmeekins.com  Thu Jun 16 10:03:29 2005
From: tim at tmeekins.com (Tim Meekins)
Date: Thu Jun 16 10:03:47 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
Message-ID: <021401c57295$524a6cf0$3201a8c0@pink>

Wrong! PDF most definitely stores text.

----- Original Message ----- 
From: "David A. Desrosiers" <hacker@gnu-designs.com>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Thursday, June 16, 2005 6:04 AM
Subject: Re: [gutvol-d] roundtripping formatted text through a .pdf


> 
>> make a few global changes -- which restores the whitespace acrobat 
>> usually strips from text -- and you can load the text back into my 
>> z.m.l. viewer-program and generate the same .pdf once again...
> 
> Acrobat doesn't store text in PDFs, they store pixels and 
> vectors and OCR'd coordinates. Most definitely not text.
> 
> 
> David A. Desrosiers
> desrod@gnu-designs.com
> http://gnu-designs.com
From marcello at perathoner.de  Thu Jun 16 10:18:22 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jun 16 10:18:31 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
Message-ID: <42B1B45E.8030806@perathoner.de>

David A. Desrosiers wrote:

> Acrobat doesn't store text in PDFs, they store pixels and vectors and
> OCR'd coordinates. Most definitely not text.

You are most definitely wrong there. How else would the "find" function
work?

Here's an example of a pdf file contents:

/F23 17.215 Tf 56.693 509.046 Td[(Chapter)-250(I)]TJ/F23 24.787 Tf 0
-74.229 Td[(Do)10(wn)-250(the)-250(Rab)10(bit-Hole)]TJ/F20 11.955 Tf 0
-44.334
Td[(Alice)-300(w)10(as)-299(be)15(ginning)-300(to)-299(get)-300(v)15(ery)-299(tired)-300(of)-300(sitting)-299(by)-300(her)-299(sister)-300(on)]TJ
0 -14.446
Td[(the)-354(bank,)-380(and)-354(of)-354(ha)20(ving)-354(nothing)-353(to)-354(do:)-518(once)-354(or)-354(twice)-354(she)-354(had)]TJ
0 -14.446
Td[(peeped)-198(into)-199(the)-198(book)-199(her)-198(sister)-199(w)10(as)-198(reading,)-209(b)20(ut)-198(it)-199(had)-198(no)-199(pictures)]TJ
0 -14.446
Td[(or)-321(con)40(v)15(ersations)-321(in)-321(it,)-339(`and)-321(what)-321(is)-321(the)-321(use)-321(of)-321(a)-321(book,')-339(thought)]TJ
0 -14.445
Td[(Alice)-250(`without)-250(pictures)-250(or)-250(con)40(v)15(ersation?')]TJ


You see that all the text is there. Spaces are simulated by horizontal 
movement and kernings also. It would not be too difficult to write a 
perl script to recover the text out of the pdf.
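
A minimal sketch of that recovery step, in Python rather than perl. It
assumes an uncompressed content stream like the excerpt above, with text
shown via TJ arrays of (string) and number items. The -150 threshold for
deciding that a horizontal shift is a word space rather than kerning is
a guess that fits this sample, not a value from the PDF Reference, and
the function names are only illustrative. Octal escapes, hex strings and
non-standard font encodings are not handled.

```python
import re

# One TJ operand: everything between "[" and "]" just before the TJ operator.
TJ_ARRAY = re.compile(r"\[(.*?)\]\s*TJ", re.S)
# Items inside the array: a literal string "(...)" or a number.
ITEM = re.compile(r"\((?P<text>(?:\\.|[^\\()])*)\)|(?P<num>-?\d+(?:\.\d+)?)")
# Assumed cut-off: shifts more negative than this count as a word space.
SPACE_THRESHOLD = -150


def tj_to_text(array_body):
    """Rebuild the visible text of one TJ array."""
    out = []
    for match in ITEM.finditer(array_body):
        if match.group("text") is not None:
            # Drop the backslash from simple escapes such as \( and \).
            out.append(re.sub(r"\\(.)", r"\1", match.group("text")))
        elif float(match.group("num")) < SPACE_THRESHOLD:
            out.append(" ")
    return "".join(out)


def extract_lines(content_stream):
    """Return one string per TJ operator found in the content stream."""
    return [tj_to_text(m.group(1)) for m in TJ_ARRAY.finditer(content_stream)]


if __name__ == "__main__":
    sample = (
        "/F23 17.215 Tf 56.693 509.046 Td[(Chapter)-250(I)]TJ"
        "/F23 24.787 Tf 0 -74.229 Td"
        "[(Do)10(wn)-250(the)-250(Rab)10(bit-Hole)]TJ"
    )
    for line in extract_lines(sample):
        print(line)
```

Run on the first two operators of the excerpt, this prints "Chapter I"
and "Down the Rabbit-Hole". Small positive numbers such as 10 or 15 are
kerning adjustments inside a word and are simply dropped.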


-- 
Marcello Perathoner
webmaster@gutenberg.org

From jhowse at nf.sympatico.ca  Thu Jun 16 15:10:17 2005
From: jhowse at nf.sympatico.ca (JHowse)
Date: Thu Jun 16 10:41:12 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <42B1B45E.8030806@perathoner.de>
References: <Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
Message-ID: <5.1.0.14.0.20050616150642.00a688a0@pop1.nf.sympatico.ca>

At 07:18 PM 16/06/05 +0200, you wrote:
>David A. Desrosiers wrote:
>
>>Acrobat doesn't store text in PDFs, they store pixels and vectors and
>>OCR'd coordinates. Most definitely not text.
>
>You are most definitely wrong there. How else would the "find" function
>work?

[snip]

And fonts are embedded in a pdf file!


>You see that all the text is there. Spaces are simulated by horizontal 
>movement and kernings also. It would not be too difficult to write a perl 
>script to recover the text out of the pdf.

or if you have the full adobe acrobat programme you can simply export to a 
rtf file. I did that sort of thing at work for three years. You may have to 
do some formatting to pretty it up, but it's definitely text.

JHowse


================================================================================
"I'm not likely to write a great novel or compose a song or save a baby
from a burning building...but I can help make sure that there is an
electronic library of free knowledge available for future people to
access."--jhutch.
Preserving History One Page at a Time!!
Celebrating our 6750th book posted to Project Gutenberg
Join Project Gutenberg's Distributed Proofreaders http://www.pgdp.net/c/
================================================================================

From hacker at gnu-designs.com  Thu Jun 16 10:43:14 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jun 16 10:43:37 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <4a6dc760506160852491a30b5@mail.gmail.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
	<4a6dc760506160852491a30b5@mail.gmail.com>
Message-ID: <Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>


> It is possible to make the text rather obscure, but that doesn't 
> mean that if formatted correctly you could not scan through the file 
> in a text editor and read it.  Granted, it's rarely done, but 
> doesn't mean it's impossible.

	I just ran strings(1) across about 40 of the PDFs I have here 
from various clients, online resources and PDFs I've created in 
Windows and with OpenOffice.org, and not a single one contained any 
readable strings that are actually in the _content_ of the documents 
themselves, other than the strings which comprise URLs embedded in the 
document itself.

	So where is the text of the document stored? If it's somewhere 
in here, why is it obfuscated by default, in every single PDF I have?

	The document content itself is most-definitely NOT stored as 
"plain text" in the pdf documents I have here, which is a pretty broad 
sample set.


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From marcello at perathoner.de  Thu Jun 16 11:07:46 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jun 16 11:07:55 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>	<4a6dc7605061608024b004c32@mail.gmail.com>	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>	<4a6dc760506160852491a30b5@mail.gmail.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
Message-ID: <42B1BFF2.1040905@perathoner.de>

David A. Desrosiers wrote:

> I just ran strings(1) across about 40 of the PDFs I have here from
> various clients, online resources and PDFs I've created in Windows
> and with OpenOffice.org, and not a single one contained any readable
> strings that are actually in the _content_ of the documents 
> themselves, other than the strings which comprise URLs embedded in
> the document itself.
> 
> So where is the text of the document stored? If it's somewhere in
> here, why is it obfuscated by default, in every single PDF I have?
> 
> The document content itself is most-definitely NOT stored as "plain
> text" in the pdf documents I have here, which is a pretty broad 
> sample set.

A pdf is a chunked file format and each chunk can be compressed or even 
encrypted. A run-of-the-mill pdf is always at least compressed.

If you create your own pdf with pdftex you can set the compression
level to 0 and lo! the text magically appears inside the pdf.
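
To illustrate why strings(1) comes up empty, here is a rough sketch that
looks for stream ... endstream chunks in a PDF and tries to inflate each
one with zlib. It assumes the chunks use the common Flate (zlib)
compression and are not encrypted. It also ignores the /Length and
/Filter entries a real reader would consult, so treat it as a
demonstration only; the function name is made up.

```python
import re
import zlib

# Everything between the "stream" keyword and the next "endstream".
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.S)


def inflate_streams(pdf_bytes):
    """Return the body of every stream, inflated where zlib accepts it."""
    bodies = []
    for match in STREAM.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj() stops at the end of the zlib data and
            # ignores the line break that precedes "endstream".
            bodies.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not zlib data: stored uncompressed or with another filter.
            bodies.append(raw)
    return bodies


if __name__ == "__main__":
    content = b"BT /F20 11.955 Tf [(Alice)-300(w)10(as)] TJ ET\n" * 4
    fake_pdf = (
        b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
        + zlib.compress(content)
        + b"\nendstream\nendobj\n"
    )
    for body in inflate_streams(fake_pdf):
        print(body.decode("latin-1"))
```

The fake_pdf built here is not a valid PDF file, only enough of one
object to show the round trip: the text operators are unreadable in the
compressed bytes and come back intact after inflating.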


-- 
Marcello Perathoner
webmaster@gutenberg.org

From jonathan.gorman at gmail.com  Thu Jun 16 11:09:07 2005
From: jonathan.gorman at gmail.com (Jon Gorman)
Date: Thu Jun 16 11:10:02 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
	<4a6dc760506160852491a30b5@mail.gmail.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
Message-ID: <4a6dc760506161109330131f7@mail.gmail.com>

On 6/16/05, David A. Desrosiers <hacker@gnu-designs.com> wrote:
> 
> > It is possible to make the text rather obscure, but that doesn't
> > mean that if formatted correctly you could not scan through the file
> > in a text editor and read it.  Granted, it's rarely done, but
> > doesn't mean it's impossible.
> 
>         I just ran strings(1) across about 40 of the PDFs I have here
> from various clients, online resources and PDFs I've created in
> Windows and with OpenOffice.org, and not a single one contained any
> readable strings that are actually in the _content_ of the documents
> themselves, other than the strings which comprise URLs embedded in the
> document itself.
> 
>         So where is the text of the document stored? If it's somewhere
> in here, why is it obfuscated by default, in every single PDF I have?
> 

In text blocks within the documents which can be encoded and are
referenced from the part of the document that sets up the layout.

>         The document content itself is most-definitely NOT stored as
> "plain text" in the pdf documents I have here, which is a pretty broad
> sample set.

People are not arguing the average case.  Like I said, it's rare for
it not to be obfuscated.  But guess what, improbable != impossible. 
You said it was impossible, that the information was stored purely as
pixels and vectors.  It's not.  There is a whole subculture that is
quite used to the idea of there being embedded text from when direct
tinkering with postscript/tex processing was more common.  You might
need a tool more complex than strings to grab the textual information
out if obfuscated (since it can really be an encoding within an
encoding).

I'm at a loss as to what your example run proved, beyond that it's rare.  And
Marcello was kind enough to provide an example where it was not
obfuscated.  See those ()?  Simple definition of them (It's been a
while since I read the Reference, so this isn't 100%)  means that the
characters are not in another encoding so there is no need to convert
them when generating the page.
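
A rough sketch of what those () amount to, assuming Python and a
content stream that is already uncompressed and uses plain single-byte
literal strings. The name literal_strings is made up for illustration,
and the pattern ignores hex strings and unescaped nested parentheses.
Note how a TJ array with kerning numbers breaks a word into fragments.

```python
import re

# A literal string: "(" ... ")" where "\x" escapes are allowed inside.
LITERAL_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)")


def literal_strings(content):
    """Collect the (...) literals from a decoded page content stream."""
    found = []
    for match in LITERAL_RE.finditer(content):
        body = match.group(1)
        # Undo the escapes for parentheses and backslashes only.
        body = re.sub(rb"\\([()\\])", rb"\1", body)
        found.append(body.decode("latin-1"))
    return found


sample = (
    b"BT /F1 12 Tf 72 720 Td (Hello) Tj "
    b"0 -14 Td (a \\(nested\\) note) Tj "
    b"[(Round) -15 (trip)] TJ ET"
)
print(literal_strings(sample))
# ['Hello', 'a (nested) note', 'Round', 'trip']
```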

It's pretty well known that the great number of automatic pdf
generators can create some very unreadable code.  I knew someone who
was bitterly disappointed at the amount of cruft and difficulty it
brings to working with them.  But ideally they still follow the rules
in the Reference (it's annoying to find, but it is available through
the adobe site).  If it's not in that syntax, it's no more a pdf than
if an "almost-XML" document had elements with no closing tags.

If I had time, I'd write one by hand for ya that had none of the
encoding mess.

I'd agree most pdf documents would be a pain to handle by hand, but
you wouldn't have to apply OCR-like techniques to most.  Just write a
parser based off the specs.  I'm confused about the point of all of this.
You seemed to be implying that bowerbird couldn't be doing what he
claimed because: " Acrobat doesn't store text in PDFs, they store
pixels and vectors and OCR'd coordinates. "   Multiple people have
pointed out that this is wrong, that there is text within pdfs.  They've
shown examples.  Remember, probably most of the obfuscation code is
there for more nefarious reasons, but some of the ideas come from valid
problems with multiple character encoding sets.  (We're talking about
techniques established well before Unicode.)

I'm not arguing that the format is good or bad, that we should abandon
ASCII files here at gutenberg or anything along those lines.  Just that
your statement was misleading.  A pdf is not like a jpeg.  In fact, as
far as vector-based systems go I'm not familiar with any vector system
that doesn't store text in a file instead of just pure vector
representation of characters due to efficiency reasons.


Jon Gorman
From Bowerbird at aol.com  Thu Jun 16 11:48:27 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 16 11:48:39 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
Message-ID: <20f.31afa78.2fe3237b@aol.com>

jon gorman said:
>   Just wanted to clarify things.  

that's good.  i like clarification...         :+)


>   I'm skeptical about bowerbird's claims as well

that's good.  i like skeptics...          ;+)

but the proof is in the pudding,
jon, the proof is in the pudding...


>   but it's misleading to say that 
>   Acrobat doesn't store text in the document.

i believe, like you, that that would be a misleading statement.


>   It is possible to make the text rather obscure

well, as i said, one _can_ make it rather totally "obscure" by
converting it to graphic format before writing it to the .pdf.
in that case, the user cannot copy out the text -- as text --
to the clipboard.  such "text" is not found by "find" either.

(here i'm largely speaking, of course, as a _programmer_
who is actually outputting the content to the .pdf driver.
most people creating a .pdf don't have that luxury, in that
they're stuck with whatever their authoring tool might do.
as a sidebar here, i will note that the problems involved in
copying text from a .pdf are well-known and long-standing,
so they _should_ have been addressed by the programmers of
common authoring tools, like word-processors, by this time.
in programming my tool, i have sought to empower my users,
including in this arena of round-tripping text put into a .pdf.)


>   but that doesn't mean that if formatted correctly 
>   you could not scan through the file in a text editor and
>   read it.  Granted, it's rarely done, but doesn't mean 
>   it's impossible.

well, i believe your statement is misleading as well, jon...

(and if you're striving to "clarify" things, you really should try 
something to see if you _can_ do it before you _say_ you can...)

load a .pdf into an editor; you won't find much (if any) text qua text,
not in a recognizable form you can easily copy out to the clipboard.

(it's not _impossible_ you will find some text, depending upon
how the .pdf was created, since there is text in some .ps files.
but it's never a long unbroken stretch before it is interrupted
by postscript commands, so this approach is doomed to failure.)

so one shouldn't expect to find text -- stored as text -- in a .pdf,
not in the traditional sense.  (however, see the p.s. on this post.)

nonetheless, if the text wasn't stored in the .pdf in _some_ way,
users wouldn't be able to copy it out to the clipboard, would they?
and acrobat wouldn't be able to do "find" operations on it, would it?

(notably, though, you'll discover that acrobat's "find" capabilities
don't extend to whitespace.  for instance, you can't do a search for
two spaces, even if there were such instances in the original file.)

-bowerbird

p.s.  it might be possible to store text in the comments of a .pdf,
i'm not sure.  if you could, then that _might_ be interesting to do.
(i will explore the possibility, especially when my app starts to
create .pdfs directly without running them through a .pdf driver.)
with such storage, one wouldn't need to pull the .pdf into acrobat
in order to retrieve the text from it, which might be a capability
that some people would find useful.  (it would also allow ordinary
search programs to search the .pdf.)  but that's just gravy to me;
as long as users can "roundtrip" text out of a .pdf, my goal is met.
once people get used to my viewer, they won't even _want_ .pdfs.
From hacker at gnu-designs.com  Thu Jun 16 11:52:37 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jun 16 11:53:39 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <4a6dc760506161109330131f7@mail.gmail.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
	<4a6dc760506160852491a30b5@mail.gmail.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
	<4a6dc760506161109330131f7@mail.gmail.com>
Message-ID: <Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>


> You said it was impossible, that the information was stored purely 
> as pixels and vectors.  It's not.

	I'll let it drop... except this one point: 

	I never said it was "impossible" for a pdf to contain text in 
any of my messages (and further, I've never even used the word 
"impossible" in any message I've ever posted to this list, ever.)

	Every single pdf I have here is exactly that: 7-bit ascii 
text, and nothing more, but the text in the pdfs is definitely not the 
text that comprises the content of the pdf itself. I have heard of 
binary pdfs, I don't have one here and couldn't find one out there.  
My collection includes pdfs which are heavily encrypted with the 
latest-n-greatest Adobe 7.whatever product, and they're still 100% 
ascii text, but none of the text (except urls) is document "content".

> You might need a tool more complex than strings to grab the textual 
> information out if obfuscated (since it can really be an encoding 
> within an encoding).

	I've got many here, and even seen quite a few commercial 
(proprietary, no source available) products hijacking pdftohtml's 
source for their pdf rendering. I think I may have found yet-another 
one last night that converts PDFs for display on a Palm handheld 
device (a commercial "Office Documents on Palm" product). Of course 
the output is absolutely horrible, as is the output of most PDFs, but 
that's another matter.

> You seemed to be implying that bowerbird couldn't be doing what he 
> claimed because: " Acrobat doesn't store text in PDFs, they store 
> pixels and vectors and OCR'd coordinates. "

	Actually, no tools that can decompose PDF back to readable 
text produce anything worth using. In 100% of the cases I've found, 
which includes Open Source and commercial tools, you have to go back 
in and reformat the entire output by hand anyway. I've tried 
automating the rewrap, paragraph layout and many other aspects, and 
it's just not worth it. It's easier to load it up in xpdf or acroread 
and cut and paste from the GUI into another file and format from that 
baseline.

	But back to the Bowerbird case... he contends that his Z.M.L. 
tool written in gwbasic (or whatever it's using these days) can do 
everything including make coffee, walk the dog, and oh yeah, convert 
pdfs to a pleasant-to-read format. If this is true, this would be the 
first tool out of literally dozens that I've tried to accomplish this 
feat successfully.

	But I'm not going to go install DOS and gwbasic to find out. 


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From Bowerbird at aol.com  Thu Jun 16 12:49:29 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 16 12:49:45 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
Message-ID: <1f6.bdf00f6.2fe331c9@aol.com>

david said:
>   But back to the Bowerbird case... 

i was wondering when we were gonna stop wasting time
talking about frivolous topics like .pdf and usability, and
get back to the most important topic of all -- me!

so thanks for getting us back on-point, david...         ;+)


>   he contends that his Z.M.L. tool 
>   written in gwbasic (or whatever it's using these days) 

realbasic.

http://www.realsoftware.com

it runs on mac (classic and o.s.x.)
and windows (95 and up) and 
even some flavors of linux.

likewise, it creates programs that
run on all those platforms as well...

and just this week, they announced a new version
-- rb2005, which is written in realbasic, so it is
a wonderful example of eating your own pudding --
and they are making the entry-level linux version
free (as in free beer).  i take it you're one of those
language snobs who wouldn't even consider basic,
but if i'm wrong, you should take a good look at it.
lots of power in it, and cross-plat that really works.

i have no linux experience, so i haven't compiled my
viewer-program out to linux yet, but if you want to
be my guinea-pig, i mean "alpha-tester", let me know.


>   can do everything including make coffee

i don't drink coffee, so there are no plans in that regard.


>   walk the dog

we have a cat.  she walks herself.
and given my gut, i should take
my own walks.  so again, no plans.

i _do_ eat, however.  and i like toasted-cheese sandwiches.
so i _do_ have plans to put a routine in my viewer-program
that will make a toasted-cheese sandwich.  you'll be able to
specify the type of bread, how light/dark you want it toasted,
and any amount of several different types of cheeses, so i am
quite excited about this.  i just wish i knew how to program it.
perhaps i'll start an open-source effort.  got any advice for me?


>   and oh yeah, convert pdfs to a pleasant-to-read format. 

no, that's not really my objective.

yes, the .pdfs that my program creates _are_ pleasant-to-read,
because they're just a .pdf version of what my viewer displays...

but the "roundtripping" goal is that when my program makes a .pdf,
the end-user can copy the text out of it, make a few global changes,
and then stick it right back into my viewer and it will look the same.
create another .pdf from that and it'll look identical to the first .pdf;
and you can again copy the text out of that, make the global changes,
and then stick it right back into my viewer and it will look the same.
no fuss, no muss, no reapplication of markup, just roundtrip usage...


>   If this is true, this would be the first tool out of literally dozens 
>   that I've tried to accomplish this feat successfully.

actually, getting the exact same text out that you put in
is not all _that_ remarkable.  or _shouldn't_ be, anyway.
but yeah, i know of no other tool that can do it either...


>   But I'm not going to go install DOS and gwbasic to find out. 

you silly boy.  i moved out of dos well over a decade ago.
and gwbasic was always quite inferior compared to qbasic.

(although, as a command-line processor, dos was a _very_
friendly interface for a power-user like myself.  my word,
i had .bat files that would interactively create .bat files!
two-letter .bat files give you 500+ quickly-run commands.
i tell ya, there were many times my efficiency could _fly_.
compared to that, a graphical-user-interface is molasses.
but hey, it's all about selling units to the masses, right?)

anyway, david, wanna alpha-test?
or would you prefer to wait for the
toasted-cheese sandwich feature?

-bowerbird
From prosfilaes at gmail.com  Thu Jun 16 14:23:53 2005
From: prosfilaes at gmail.com (David Starner)
Date: Thu Jun 16 14:24:04 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <f782b1dieuk8v2135bs7ka5fggotsdft74@4ax.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
	<6d99d1fd050615133945b30ce3@mail.gmail.com>
	<f782b1dieuk8v2135bs7ka5fggotsdft74@4ax.com>
Message-ID: <6d99d1fd05061614235190a033@mail.gmail.com>

On 6/16/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
> On Wed, 15 Jun 2005 15:39:26 -0500,  David Starner <prosfilaes@gmail.com>
> wrote:
> | Why? With decent margins, you can print letter on A4 or vice versa.
> | And we aren't really talking typesetters here; typesetters don't care
> | what size paper it is, since they're going to rip it apart and re-set
> | it anyway. We're talking about people who are dumping our preformed
> | blob to paper.
> 
> Please note the Subject of this thread:
> Print-on-demand and dead-tree copies of Gutenberg texts.

I noted it. I don't see how it changes anything.
 
> IMO felling trees is a Bad Idea, especially when with a little thought
> fewer trees could be felled.

Generous margins are always nice, and printing letter on A4 doesn't
cause more trees to be cut down; it's the same amount of text, shaped
differently.

> Inconsistent use of SI units and international
> standard paper sizes remain today a primary
> cause for U.S. businesses failing to meet
> the expectations of customers worldwide.

And use of international standard paper sizes remains today a primary
cause of international businesses failing to meet the expectations of
American customers.

It's cute how you point out that all the print-on-demand places are in
America; perhaps that means that we should use American paper sizes,
then?
From jonathan.gorman at gmail.com  Thu Jun 16 14:39:44 2005
From: jonathan.gorman at gmail.com (Jon Gorman)
Date: Thu Jun 16 14:39:54 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
	<4a6dc760506160852491a30b5@mail.gmail.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
	<4a6dc760506161109330131f7@mail.gmail.com>
	<Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>
Message-ID: <4a6dc76050616143950c11c7e@mail.gmail.com>

>         I never said it was "impossible" for a pdf to contain text in
> any of my messages (and further, I've never even used the word
> "impossible" in any message I've ever posted to this list, ever.)

And I shouldn't put words in your mouth.  I'm sorry.  I just
interpreted " Acrobat doesn't store text in PDFs" to mean that the
specification says it never stores text in pdfs, hence it would be
impossible to add.  I realized later it could also be interpreted
slightly differently (as referring either to what the applications
that create pdfs do or to common practice).  It is a real pain to
get the text out, and it is getting worse with each version of pdf.

>         But back to the Bowerbird case... he contends that his Z.M.L.
> tool written in gwbasic (or whatever it's using these days) can do
> everything including make coffee, walk the dog, and oh yeah, convert
> pdfs to a pleasant-to-read format. 

I must admit perhaps I wasn't following closely but I think bowerbird
just claimed the pdf that he exported was easy to import back as pdf
(via copying out the text), not necessarily that he converted an
existing pdf file.  Of course, I'm probably wrong about that.  Without
capitals I sometimes get lost ;).  Except for reading e.e. cummings I
suppose.

Again David, I'm sorry if I hurt any feelings or anything along those
lines.  I know some Palm developers so your name is familiar. 
They're happy for the help you contributed to the community so I would
be in some hot water if I ticked you off by being a little too
flippant.  For some reason certain threads on this mailing list tend
to warp my brain I think.  Wonder if it has anything to do with the
odd nesting sensation I get when I read certain parts of gutvol-d.

Jon
From marcello at perathoner.de  Thu Jun 16 15:27:46 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jun 16 15:27:59 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>
References: <62.5703acdd.2fe13657@aol.com>	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>	<4a6dc7605061608024b004c32@mail.gmail.com>	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>	<4a6dc760506160852491a30b5@mail.gmail.com>	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>	<4a6dc760506161109330131f7@mail.gmail.com>
	<Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>
Message-ID: <42B1FCE2.5020405@perathoner.de>

David A. Desrosiers wrote:

> Every single pdf I have here is exactly that: 7-bit ascii text, and 
> nothing more,

The encoding used in a pdf depends on the font technology: Type-1, 
Type-3, TrueType etc. You can link a dictionary to every font and thus 
change the standard encoding in any way you like. pdf can even 
accommodate multi-byte encodings.


-- 
Marcello Perathoner
webmaster@gutenberg.org

From donovan at abs.net  Thu Jun 16 15:42:17 2005
From: donovan at abs.net (D Garcia)
Date: Thu Jun 16 15:39:53 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <42B1BFF2.1040905@perathoner.de>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
	<42B1BFF2.1040905@perathoner.de>
Message-ID: <200506161842.17244.donovan@abs.net>

On Thursday 16 June 2005 02:07 pm, Marcello Perathoner wrote:
> David A. Desrosiers wrote:
> A pdf is a chunked file format and each chunk can be compressed or even
> encrypted. A run-of-the-mill pdf is always at least compressed.
>
> If you create your own pdf with pdftex you can set the compression
> level to 0 and lo! the text magically appears inside the pdf.

And if you're truly insane (and/or interested) in the format, you can obtain 
the specs and learn how to write a PDF by hand in a standard text editor.
(Which, yes, I have done, including writing vector graphics.) If you 
understand the technique, you can even write simple scripts in (your 
interpreted language of choice) to output simple PDF files directly, which is 
great for doing things like cgi report generation without library 
dependencies and the like.

iirc, the most commonly used compression in PDF is FLATE, which is relatively 
trivial and fast/good enough for the majority of cases.
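
A minimal sketch of such a script, assuming Python. build_pdf is a
hypothetical helper name, and the page size, font and coordinates are
arbitrary choices. It writes a single uncompressed page, so the text
stays visible to strings(1) or a text editor, and it computes the
cross-reference offsets as it goes.

```python
def build_pdf(text):
    """Assemble a one-page, uncompressed PDF showing `text` in Helvetica."""
    # Escape the characters that are special inside a (...) literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    content = ("BT /F1 12 Tf 72 720 Td (%s) Tj ET" % escaped).encode("latin-1")

    # Object bodies, numbered 1..5 in this order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj" for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)


if __name__ == "__main__":
    with open("hello.pdf", "wb") as handle:
        handle.write(build_pdf("Hello from a hand-written PDF"))
```

Adding /Filter /FlateDecode and running the content through
zlib.compress would give the compressed variant discussed above.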
From Bowerbird at aol.com  Thu Jun 16 15:52:10 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 16 15:52:27 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
Message-ID: <55.75704963.2fe35c9a@aol.com>

jon gorman said:
>   I think bowerbird just claimed the pdf that he exported 
>   was easy to import back as pdf (via copying out the text), 
>   not necessarily that he converted an existing pdf file.  
>   Of course, I'm probably wrong about that.  
>   Without capitals I sometimes get lost ;)

i understand...           :+)

and what you've said is pretty-much correct.

yes, i'm _only_ talking about .pdfs that _my_ viewer-app creates.
(if some other program created the .pdf, then blame that program.)

so, you put plain-text into my viewer, and it formats it nicely.
you can print that nice formatting to a .pdf (which looks nice).

and then you can copy the text out of the .pdf.  when you do that,
much of the nice formatting has been stripped away, of course,
and we're back to plain-text again.  (if i remember correctly,
.pdf _does_ retain italicizing, but it _doesn't_ retain bolding.
i don't have the faintest idea why, it's kinda weird like that.
and it definitely stores the color of the text, which is cute.
but it definitely strips the _size_ of the text, which is bad.
all of this is in _my_ version of acrobat reader, which is v4.
we talk about acrobat/.pdf like it's one straightforward thing,
but it's a crazy mish-mash of different-and-changing versions,
so all of our discussion needs to be couched in careful clauses.)

but the loss of formatting doesn't matter, because after you
have made a few global changes (which, among other things,
restore the blank lines between paragraphs that get stripped),
you can put the text back into my viewer-program, and it will
redo the nice formatting, just like it did it in the first place...

with zen markup, this is all pretty easy to accomplish...      :+)


>   If I had time, I'd write one by hand for ya 
>   that had none of the encoding mess.

i'd love to see that!

-bowerbird
From hacker at gnu-designs.com  Fri Jun 17 02:46:46 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Fri Jun 17 02:48:02 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <4a6dc76050616143950c11c7e@mail.gmail.com>
References: <62.5703acdd.2fe13657@aol.com>
	<Pine.LNX.4.62.0506160903520.15329@angst.gnu-designs.com>
	<4a6dc7605061608024b004c32@mail.gmail.com>
	<Pine.LNX.4.62.0506161112210.17948@angst.gnu-designs.com>
	<4a6dc760506160852491a30b5@mail.gmail.com>
	<Pine.LNX.4.62.0506161338070.3833@angst.gnu-designs.com>
	<4a6dc760506161109330131f7@mail.gmail.com>
	<Pine.LNX.4.62.0506161442510.4182@angst.gnu-designs.com>
	<4a6dc76050616143950c11c7e@mail.gmail.com>
Message-ID: <Pine.LNX.4.62.0506170545370.13001@angst.gnu-designs.com>


> Again David, I'm sorry if I hurt any feelings or anything along 
> those lines.  I know some Palm developers so your name is familiar. 
> They're happy for the help you contributed to the community so I 
> would be in some hot water if I ticked you off by being a little too 
> flippant.

	I have very thick skin, it takes a lot to hurt my feelings ;)

	No harm, no foul. Your comments (and those of others) were 
informative and worthwhile.

> Wonder if it has anything to do with the odd nesting sensation I get 
> when I read certain parts of gutvol-d.

	I know... let's blame Bowerbird! ;) j/k 


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From marcello at perathoner.de  Fri Jun 17 03:15:16 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Jun 17 03:15:40 2005
Subject: [gutvol-d] roundtripping formatted text through a .pdf
In-Reply-To: <55.75704963.2fe35c9a@aol.com>
References: <55.75704963.2fe35c9a@aol.com>
Message-ID: <42B2A2B4.1020402@perathoner.de>

Bowerbird@aol.com wrote:

> but the loss of formatting doesn't matter, because after you
> have made a few global changes (which, among other things,
> restore the blank lines between paragraphs that get stripped),
> you can put the text back into my viewer-program, and it will
> redo the nice formatting, just like it did it in the first place...

It's no round-tripping if you have to hand-tweak the files.

Before I'd have to re-apply by hand all things your program fumbled 
along the way, I'd "round-trip" the pdf thru images and Abbyy 
Finereader. (That works for *any* pdf.)

What use is this feature anyway, if you just `round-trip' pdfs produced 
by your program? Then why not keep the zml file around? If you could 
convert *all* pdf files into zml, that would be something.

Or did you just learn a new buzz-word: "round-trip", and are milking it 
for what it's worth?


-- 
Marcello Perathoner
webmaster@gutenberg.org

From hyphen at hyphenologist.co.uk  Fri Jun 17 03:34:46 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Fri Jun 17 03:35:34 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of Gutenberg
	texts.
In-Reply-To: <6d99d1fd05061614235190a033@mail.gmail.com>
References: <26b51c32050615132219efb74e@mail.gmail.com>
	<8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
	<6d99d1fd050615133945b30ce3@mail.gmail.com>
	<f782b1dieuk8v2135bs7ka5fggotsdft74@4ax.com>
	<6d99d1fd05061614235190a033@mail.gmail.com>
Message-ID: <j295b1lfvt5fqptrgupoao88lcqqbmi7qd@4ax.com>

On Thu, 16 Jun 2005 16:23:53 -0500,  David Starner <prosfilaes@gmail.com>
wrote:


| > Inconsistent use of SI units and international
| > standard paper sizes remain today a primary
| > cause for U.S. businesses failing to meet
| > the expectations of customers worldwide.
| 
| And use of international standard paper sizes remains today a primary
| cause of international businesses failing to meet the expectations of
| American customers.

What is it about *international* which you do not understand?
 
| It's cute how you point out that all the print-on-demand places are in
| America; perhaps that means that we should use American paper sizes,
| then?

Now that is a strange attitude in Project Gutenberg which is named after a
person who lived in Mainz, which is now part of Germany and was at the time
part of Europe.
http://www.greatsite.com/timeline-english-bible-history/gutenberg.html
Incidentally he died in 1468, and the Pilgrim fathers sailed from Plymouth,
Devon, England, in the Mayflower on 16 September 1620, some 150 years after
Gutenberg died.

-- 
Dave Fawthrop <dave hyphenologist co uk> http://www.webshots.com 
Thousands of wonderful professional photos for your Wallpaper and 
Screensaver. also 200,000 amateur pics. Four new pics each day.

From nwolcott at dsdial.net  Thu Jun 16 20:47:15 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Fri Jun 17 07:57:33 2005
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg
	texts.
References: <26b51c32050614142824831367@mail.gmail.com>
Message-ID: <00cf01c5734c$c2891680$049495ce@gw98>

Bowerbird's comments are very appropriate. If you have a special book you
wish to republish, then you can afford the time and effort to get it ready
for POD. I have published 2 books this way, by Jules Verne, which I did as
an experiment in republishing a 100 year old book with illustrations. You
can see it at WWW.LULU.COM; search for Verne and you will see "The Blockade
Runners". Lulu is the only way to go if you do not want to pay up front
charges, all other POD publishers require $500 up front. The disadvantage
with Lulu is that you have to be prepared to get the book ready for press.
There are a lot of things necessary to do this  -- just take page numbers
for example, getting them on the right place on the page (different for
right and left maybe) running headers, page breaks or not for chapters,
illustrations, cover design, back cover design, blurb for cover insert, art
work for cover, footnotes properly numbered and placed on the page, choice
of fonts, you may need type 1 fonts for a good appearance,  you may need
Adobe or Quark to do a half way presentable job for your book. It took me
about 6 weeks to get my books (they are partly identical) ready for Lulu.
They came out very well, and even selling them at cost comes out at $6 and
then there is uncle sam's $2 minimum for postage so none have been sold.
Admittedly this is not a barn burner of a book, but as a dual language text
it would be very useful as the French is quite elementary.

I will probably do another book or two, I would like to get better pictures.
The POD presses use 600 dpi lasers, 1200 dpi lasers are available and are a
must for decent half tone pictures.(Letterpress uses 400 lpi plus).
Unfortunately Lulu does not yet have them.

Good luck on your first project!

----- Original Message -----
From: "grendelkhan" <grendelkhan@gmail.com>
To: <gutvol-d@lists.pglaf.org>
Sent: Tuesday, June 14, 2005 5:28 PM
Subject: [gutvol-d] Print-on-demand and dead-tree copies of Gutenberg texts.


> I was having a discussion with my father, and I thought I would bring
> it up on the mailing list, as it seems to be the place for it.
>
> We'd just come out of our local Wal-Mart, and I'd noticed the
> out-of-copyright books (classics and such) being sold for $6 to $11
> each. I commented that folks could just download the books for free if
> they wanted to read them, but he asked how many people owned a
> computer, and how many of those had heard of Project Gutenberg?
>
> So I did a bit of researching, and discovered that there exist "print
> on demand" publishers, which instead of doing the offset-printing runs
> of thousands and thousands of books, will, once a book has been
> prepared and typeset, sometimes keep none at all in stock, and print
> them only when ordered.
>
> It seems that it would be a good idea to come up with some way to
> offer the majority of PG's catalog through some method of
> print-on-demand publishing, selling at-cost. Many Gutenberg works are
> obscure, and not of general enough interest to warrant a print run
> from a traditional publisher.
>
> I'm aware that I could clearly run off and do this myself, but (a) I
> wanted to get some feedback from the community at large, and (b)
> print-on-demand publishing still requires start-up costs, and a
> per-book "setup" fee of some kind, above and beyond the per-copy
> materials cost. Given that PG has the Distributed Proofreaders to
> provide lots and lots of work on worthy projects, and given that PGLAF
> is a charitable organization which lots of people love, is there some
> way to get around that issue?
>
> Would it be worth it to provide a source of dead-tree editions of many
> of the archive's works? Thoughts? Objections? Pointers to some guy
> who's been doing this for the last ten years that I failed to Google
> up?
>
> --grendelkhan
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>


From nwolcott at dsdial.net  Fri Jun 17 07:28:41 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Fri Jun 17 07:57:37 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of
	Gutenbergtexts.
References: <26b51c32050615132219efb74e@mail.gmail.com><8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com>
	<26b51c3205061514095c5e8f8@mail.gmail.com>
Message-ID: <00d101c5734c$c3c05e00$049495ce@gw98>

If you google "Print on Demand" or "pod publishing" you will find that there
are several surveys which review all the POD publishers and list their
costs, minimums, etc. These are not exactly up to date but will give you a
good basis for comparison. Many charge you for making the cover and for a
book jacket too. Don't forget those if you are going hardback. Also be aware
of limits on book size. Cafe Press has an unrealistically low limit which
precludes illustrations. But other than Lulu and Cafe Press you are dealing
with the vanity press market where you are paying up front for "marketing"
and whatever else that entails.

----- Original Message -----
From: "grendelkhan" <grendelkhan@gmail.com>
To: <gutvol-d@lists.pglaf.org>
Sent: Wednesday, June 15, 2005 5:09 PM
Subject: Re: [gutvol-d] Re: Print-on-demand and dead-tree copies of
Gutenbergtexts.


> On 6/15/05, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
> > Just a mention that all Europe uses A4 paper.
> > Anything designed solely for American paper sizes will be useless to
> > typesetters in Europe.
>
> I was planning on 6"x9", which I think is the standard trade paperback
> size. Except... hmm.
>
> http://www.cafepress.com/cp/info/help/learn_book_info.aspx
>
> Cafe Press states that 4.18in x 6.88in is the standard 'Mass Market
> Paperback' size. Also 5in x 8in for 'Standard Paperback'.
>
> http://www.whitehallprinting.com/TrimSize.html
>
> Some random printing company lists 6x9 and 5.5x8.5 as 'Standard Trim
> Sizes'.
>
> http://www.powerhomebiz.com/vol93/selfpublishing2.htm
>
> Another random tutorial lists 6x9 and 5-3/8x8.
>
> http://www.josephzitt.com/books/smwb-howto.php#pod
>
> Says here that apparently the 6x9 format is standard, at least with
> Lightning Source.
>
> Ah, and Cafe Press offers printing for $7 plus $0.03 per page with no
> setup fees. So, probably not the cheapest option. Perhaps I'll prep
> something and approach Lightning Source asking what they need in the
> way of preparation supplies---that is, what can be done for them.
>
> Is 6x9 a standard paperback size in Europe? I suppose that's of less
> interest. I'll be measuring some of my paperbacks at home this evening
> once I get back from work. Maybe print and trim a few test pages or
> something.
>
> --grendelkhan
>


From nwolcott at dsdial.net  Fri Jun 17 07:22:33 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Fri Jun 17 07:57:44 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of
	Gutenbergtexts.
References: <26b51c32050615132219efb74e@mail.gmail.com>
Message-ID: <00d001c5734c$c33a1700$049495ce@gw98>

The best way to see what is involved in publishing POD texts is to actually
do that on Lulu.

    I have no financial interest one way or another in Lulu, but they do
offer a service: for an upfront per-item charge of $4 plus $0.02 per page
(cheaper than Xerox) they keep your book available on their hard drive in
perpetuity, or until they go out of business. This $4 charge is buried in
the $500 up-front charge by other POD publishers.

    Please note that Lulu is not doing the publishing, they are just
providing a needed service between the producer of the book and Ingram,
Lightspeed, and other POD sites which print thousands of texts per day.
These biggies are not interested in answering your phone call, and their
product is marketed through publishing channels for $20 to $30 per paperback
copy, something Lulu provides for $6.

Not that Lulu, as any small company in a niche market, has not had some
problems. But these are largely faced up front on their message boards and
addressed conscientiously by management.

Taking a book through Lulu involves going through 6 steps that are the bare
minimum for a publishing process. I encourage those who are interested in
POD to actually get their feet wet and produce a book. Have you thought
about cover art? Are you a professional illustrator? Can you afford an
artist?

And if, as I found, no one will buy a Lulu book even at their cost of $6,
then even lowering the price to $1 would not produce any sales in this
marketing-oriented world.

And do not forget there are mass-market publishers of PD books at $3 to $5,
such as the Wordsworth Classics; you just do not see them in bookstores and
must special order them. Also, due to marketing processes, they are not
handled by book distributors but by newsvendors, which makes them even more
difficult to obtain. And since, as is often the case in PG, the source of
the 1800 version is not noted but is left to book sleuths to determine, it
is not surprising that "sales" are low.

Case in point: Journey to the Centre of the Earth, available on PG as A
Journey to the Interior of the Earth, tr. by Frederick A. Malleson, a fairly
complete and literary Victorian translation, is $3.95 by special order at
Barnes and Noble.
----- Original Message -----
From: "grendelkhan" <grendelkhan@gmail.com>
To: <gutvol-d@lists.pglaf.org>
Sent: Wednesday, June 15, 2005 4:22 PM
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of
Gutenbergtexts.


> Thanks to everyone for their comments so far! I'm learning quite a bit as
> I go.
>
> As of October 2003, the US Commerce department reported that about
> three-fifths of households had a computer; a little over half had
> internet access.
>
> https://www.esa.doc.gov/Reports/NationOnlineBroadband04.htm
>
> So it's not as bad as I was led to believe. Still,
>
> Perhaps I should have stated my goals a little more clearly. I have no
> particular interest in making money or making a business out of this.
> I'd simply like to make the books available---through whatever means
> that may be---in dead-tree form. I suppose it's a terrible idea to tie
> the actual Project to a commercial entity by developing a working
> relationship with them---I don't think an "Official Project Gutenberg
> Edition" is a good idea.
>
> lulu.com, as mentioned, has no setup fees, but their pricing is a mite
> stiff---$4.53 plus $0.02/page. Certainly better than buying stuff from
> most university presses, but not exactly bargain-basement. Lightning
> Source charges (based on some quick googling at
>
> http://com1.runboard.com/bthescribesmessageboard.fwritingarchives.t45%7Coffset=15
> ), $0.90 plus $0.013 per page, but I don't know what kind of binding
> that requires, or what sort of setup fees they charge. Perhaps they'd
> waive them if DP put out some sort of print-ready version in addition
> to human-readable text. I'm thinking TeX->PDF here, as it's pretty
> much the stablest human-readable-yet-fully-marked-up format available.
> Thoughts? I suppose I should take a relatively short etext, mark it up
> and see how it looks.
>
> I concur that simply throwing plain text, or even decent HTML, at
> paper is a horrible idea. So, what I ask is---is there a way to
> prepare the etexts as, in addition to HTML, whatever format is
> print-ready for these machines? Since typesetting a ready copy is a
> simple matter of feeding it to a Xerox DocuTech or whatever the
> $100,000 piece of hardware the print shop uses is, how can we do the
> necessary preprocessing ourselves? What exactly does the "setup fee"
> include?
>
> Thanks to everyone again for being so helpful with this.
>
> --grendelkhan
>
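
A rough way to compare the per-copy figures quoted in this thread is to
treat each one as a fixed charge plus a per-page rate. The sketch below does
that in Python. The 300-page book length is a made-up example and the
function names are only illustrative; setup fees, cover charges, and
shipping are left out because the thread does not settle them.

```python
# Per-copy print cost = fixed per-copy charge + per-page rate * page count.
# The prices are the figures quoted in this thread (June 2005). They are
# not current, and they exclude setup fees, covers, and shipping.
QUOTED_PRICES = {
    "Lulu": (4.53, 0.02),
    "Lightning Source": (0.90, 0.013),
    "Cafe Press": (7.00, 0.03),
}


def per_copy_cost(base, per_page, pages):
    """Return the cost in dollars of printing one copy with `pages` pages."""
    return round(base + per_page * pages, 2)


def cost_table(pages):
    """Map each quoted vendor to its per-copy cost for a book of `pages` pages."""
    return {
        name: per_copy_cost(base, rate, pages)
        for name, (base, rate) in QUOTED_PRICES.items()
    }


if __name__ == "__main__":
    # 300 pages is a hypothetical length, chosen only for illustration.
    for name, cost in sorted(cost_table(300).items(), key=lambda item: item[1]):
        print(f"{name:17s} ${cost:.2f}")
```

For that hypothetical 300-page book the quoted rates come to about $4.80
(Lightning Source), $10.53 (Lulu), and $16.00 (Cafe Press) per copy, before
whatever setup fee applies.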


From nwolcott at dsdial.net  Fri Jun 17 08:01:20 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Fri Jun 17 10:51:55 2005
Subject: [gutvol-d] Re: Print-on-demand and dead-tree copies of
	Gutenbergtexts.
References: <26b51c32050615132219efb74e@mail.gmail.com><8v31b15dh04t720s70696hg7ip3n8djklk@4ax.com><6d99d1fd050615133945b30ce3@mail.gmail.com><f782b1dieuk8v2135bs7ka5fggotsdft74@4ax.com><6d99d1fd05061614235190a033@mail.gmail.com>
	<j295b1lfvt5fqptrgupoao88lcqqbmi7qd@4ax.com>
Message-ID: <000201c57365$207ed000$0b9495ce@gw98>

If I'm not mistaken Lightspeed and some of the biggest POD publishers are UK
based.
----- Original Message -----
From: "Dave Fawthrop" <hyphen@hyphenologist.co.uk>
To: "David Starner" <prosfilaes@gmail.com>; "Project Gutenberg Volunteer
Discussion" <gutvol-d@lists.pglaf.org>
Sent: Friday, June 17, 2005 6:34 AM
Subject: Re: [gutvol-d] Re: Print-on-demand and dead-tree copies of
Gutenbergtexts.


> On Thu, 16 Jun 2005 16:23:53 -0500,  David Starner <prosfilaes@gmail.com>
> wrote:
>
>
> | > Inconsistent use of SI units and international
> | > standard paper sizes remain today a primary
> | > cause for U.S. businesses failing to meet
> | > the expectations of customers worldwide.
> |
> | And use of international standard paper sizes remains today a primary
> | cause of international businesses failing to meet the expectations of
> | American customers.
>
> What is it about *international* which you do not understand?
>
> | It's cute how you point out that all the print-on-demand places are in
> | America; perhaps that means that we should use American paper sizes,
> | then?
>
> Now that is a strange attitude in Project Gutenberg which is named after a
> person who lived in Mainz, which is now part of Germany and was at the
> time part of Europe.
> http://www.greatsite.com/timeline-english-bible-history/gutenberg.html
> Incidentally he died in 1468, and the Pilgrim fathers sailed from
> Plymouth, Devon, England, in the Mayflower on 16 September 1620, some 150
> years after Gutenberg died.
>
> --
> Dave Fawthrop <dave hyphenologist co uk> http://www.webshots.com
> Thousands of wonderful professional photos for your Wallpaper and
> Screensaver. also 200,000 amateur pics. Four new pics each day.
>


From grythumn at gmail.com  Sun Jun 19 14:06:27 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Sun Jun 19 14:06:37 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
Message-ID: <15cfa2a505061914061d3cfda9@mail.gmail.com>

I've been going through my files trying to close out some partially
finished projects. I have several Beatrix Potter books that had
missing or damaged pages, and I went to the library today to try to
fill in the blanks.

Unfortunately, they only had the newer editions with a modern
copyright, claiming a copyright because they had made a new transfer
of the old watercolors. As far as I understand copyright law, this
claim is bogus; a derivative work must be different enough from the
original to be considered a new work; a slight technical improvement
on the reproduction is not enough. An original lithograph, sure, but
not making new screens. This may or may not be complicated by the fact
that the publisher operates both out of the UK and the US.

Is my understanding correct enough to go through with an official
clearance request? Or shall I hunt for older copies? Potter books are
not rare, but finding the older ones is more difficult.

Thanks,
R C
From collin at xs4all.nl  Sun Jun 19 15:04:14 2005
From: collin at xs4all.nl (Branko Collin)
Date: Sun Jun 19 14:50:28 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
In-Reply-To: <15cfa2a505061914061d3cfda9@mail.gmail.com>
Message-ID: <42B607FE.401.4C8C51@localhost>


On 19 Jun 2005, at 17:06, Robert Cicconetti wrote:

> Is my understanding correct enough to go through with an official
> clearance request? Or shall I hunt for older copies? Potter books are
> not rare, but finding the older ones is more difficult.

In this case I would say you even have a duty to your readers to use 
the newer reproductions. :-)

The deciding court case in the US is Bridgeman v. Corel. A lot has
been written about it on the web. The court's decision hinged on the
concept of originality, IIRC. The idea is that a reproduction made to
convey the original author's intent as faithfully as possible adds no
originality of its own.

So yes, I would send this in for clearance. PG might reject it if you
cannot show that these are indeed mere reproductions.

-- 
branko collin
collin@xs4all.nl
From shimmin at uiuc.edu  Sun Jun 19 17:34:16 2005
From: shimmin at uiuc.edu (shimmin@uiuc.edu)
Date: Sun Jun 19 17:34:40 2005
Subject: [gutvol-d] Derivative works, or, what
 is copyrightable?
Message-ID: <70a85e35.f1d820e.8198d00@expms5.cites.uiuc.edu>

As another poster pointed out, in the U.S., Bridgeman v. Corel
says that some mechanical reproductions are not "original
works" for the purpose of copyrightability; indeed, the point
of creating these works is to be unoriginal.  Whether PGLAF
wants to stand on Bridgeman in this particular case is their
own decision; as always, the only sure test as to whether
something is clearable is to try and clear it.

That said, if it's not the illustrations you're interested in,
but merely need to consult another edition to repair lacunae
in the text you're dealing with, then just consult whatever
editions you have easily at hand, and repair the text accordingly.
From grythumn at gmail.com  Sun Jun 19 18:43:08 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Sun Jun 19 18:43:23 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
In-Reply-To: <70a85e35.f1d820e.8198d00@expms5.cites.uiuc.edu>
References: <70a85e35.f1d820e.8198d00@expms5.cites.uiuc.edu>
Message-ID: <15cfa2a505061918434a7da1a1@mail.gmail.com>

On 6/19/05, shimmin@uiuc.edu <shimmin@uiuc.edu> wrote:
> As another poster pointed out, in the U.S., Bridgeman v. Corel
> says that some mechanical reproductions are not "original
> works" for the purpose of copyrightability; indeed, the point
> of creating these works is to be unoriginal.  Whether PGLAF
> wants to stand on Bridgeman in this particular case is their
> own decision; as always, the only sure test as to whether
> something is clearable is to try and clear it.

Okay. I figured it'd be "Less Work For Greg" if I asked the list in
general first. :)

I'll fill out the clearances tonight. To be honest, the differences
are fairly small; they've cleared out some of the screening artifacts
and the colors are a little more vivid; how much of that is because
they are less than 20 years old I cannot say. :)
 
> That said, if it's not the illustrations you're interested in,
> but merely need to consult another edition to repair lacunae
> in the text you're dealing with, then just consult whatever
> editions you have easily at hand, and repair the text accordingly.

Unfortunately, I need both the images and the text.

We're fairly close to having a complete set; once I finish up the
extant books (One will have to be DP-EU only; it's from 1930) I plan
to go back and produce some cleaner scans for my first few books and
possibly those from the other PMs (assuming I get permission; I don't
want to step on toes.)

R C
From gbnewby at pglaf.org  Sun Jun 19 19:06:14 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Jun 19 19:06:15 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
In-Reply-To: <15cfa2a505061918434a7da1a1@mail.gmail.com>
References: <70a85e35.f1d820e.8198d00@expms5.cites.uiuc.edu>
	<15cfa2a505061918434a7da1a1@mail.gmail.com>
Message-ID: <20050620020614.GA24974@pglaf.org>

On Sun, Jun 19, 2005 at 09:43:08PM -0400, Robert Cicconetti wrote:
> On 6/19/05, shimmin@uiuc.edu <shimmin@uiuc.edu> wrote:
> > As another poster pointed out, in the U.S., Bridgeman v. Corel
> > says that some mechanical reproductions are not "original
> > works" for the purpose of copyrightability; indeed, the point
> > of creating these works is to be unoriginal.  Whether PGLAF
> > wants to stand on Bridgeman in this particular case is their
> > own decision; as always, the only sure test as to whether
> > something is clearable is to try and clear it.
> 
> Okay. I figured it'd be "Less Work For Greg" if I asked the list in
> general first. :)

(Yes, it was a good discussion!)

> I'll fill out the clearances tonight. To be honest, the differences
> are fairly small; they've cleared out some of the screening artifacts
> and the colors are a little more vivid; how much of that is because
> they are less than 20 years old I cannot say. :)

As people have said: doing such updates does not qualify
for a new copyright, in our view.

> > That said, if it's not the illustrations you're interested in,
> > but merely need to consult another edition to repair lacunae
> > in the text you're dealing with, then just consult whatever
> > editions you have easily at hand, and repair the text accordingly.
> 
> Unfortunately, I need both the images and the text.
> 
> We're fairly close to having a complete set; once I finish up the
> extant books (One will have to be DP-EU only; it's from 1930) I plan
> to go back and produce some cleaner scans for my first few books and
> possibly those from the other PMs (assuming I get permission; I don't
> want to step on toes.)

I know there were some issues with some Potter illustrations coming
later than 1923, but as long as we can clear the images they're fine
to include.

For a reminder, here's our policy that relates (at least peripherally)
to the issue of cleaned up images.  Thanks!  Greg


PROJECT GUTENBERG'S POSITION ON "SWEAT OF THE BROW" COPYRIGHT CLAIMS

Work performed on a public domain item, known as sweat of the brow,
does not result in a new copyright.  This is the judgment of Project
Gutenberg's copyright lawyers, and is founded in a study of case law
in the United States.  This is founded in the notion of authorship,
which is a prerequisite for a new copyright.  Non-authorship
activities do not create a new copyright.

Some organizations erroneously claim a new copyright when they add
value to a public domain item, such as to an old printed book.  But
despite the difficulty of the work involved, none of these activities
result in new copyright protection when performed on a public domain
item:

   - scanning and optical character recognition (OCR)

   - proofreading and OCR error correction

   - fixing spelling and typography, including substantial updates to
spelling such as changing from American to British

   - adding markup (HTML, XML, TeX, etc.)

   - digitizing, cropping, color-adjusting or other modifications
to images

   - addition of trivial new content, such as images to indicate
page breaks in an HTML file, or pictures of gothic letters for the
first letter in a chapter, or adding or removing a few words per
chapter.

   - substantial reorganization, such as moving footnotes to end-notes,
or changing the locations of pictures within the text

   - recoding to new character sets, such as Unicode, or new formats,
such as PDF


There is some value-added content that DOES get a new copyright, but
only for the actual new work (that is, it may be possible to remove
the new copyrighted content to go back to a public domain document):

   - translation into another human language

   - creating a new compilation of existing materials (though the
individual items compiled retain their public domain status)

   - creating new original art work

   - creating an original derivative work, such as an audio
performance, a new chapter, or a set of favorite quotations

   - adding a new introduction or critical essay

Project Gutenberg is able to utilize any material which is judged to
be public domain in the country of use (i.e., the United States).  If
it is determined that components of a digital item are public domain,
but others are not, then the copyrighted components may be removed
without the permission of whoever owns the copyright for the new
content.

It is Project Gutenberg's practice to seek permission of copyright
claimants before harvesting their materials.  This is done in order to
be polite, and to allow the producer or distributor to request a
particular credit be used.  But if permission is not given, public
domain items can still be used by Project Gutenberg, typically without
any attribution.  Because Project Gutenberg receives submissions
from many different sources, it is not always clear where an item came
from.  Volunteers who submit content they did not themselves generate
should be diligent about reporting sources, even if the source will
not be credited in the item as distributed by Project Gutenberg.

Most recently updated April 6, 2004


From grythumn at gmail.com  Sun Jun 19 20:33:24 2005
From: grythumn at gmail.com (Robert Cicconetti)
Date: Sun Jun 19 20:33:41 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
In-Reply-To: <20050620020614.GA24974@pglaf.org>
References: <70a85e35.f1d820e.8198d00@expms5.cites.uiuc.edu>
	<15cfa2a505061918434a7da1a1@mail.gmail.com>
	<20050620020614.GA24974@pglaf.org>
Message-ID: <15cfa2a505061920334e493726@mail.gmail.com>

On 6/19/05, Greg Newby <gbnewby@pglaf.org> wrote:
> On Sun, Jun 19, 2005 at 09:43:08PM -0400, Robert Cicconetti wrote:
> > Okay. I figured it'd be "Less Work For Greg" if I asked the list in
> > general first. :)
> 
> (Yes, it was a good discussion!)

Doing my bit to improve the signal to noise ratio. :)
 
> > I'll fill out the clearances tonight. To be honest, the differences
> > are fairly small; they've cleared out some of the screening artifacts
> > and the colors are a little more vivid; how much of that is because
> > they are less than 20 years old I cannot say. :)
> 
> As people have said: doing such updates does not qualify
> for a new copyright, in our view.

Great!

> > We're fairly close to having a complete set; once I finish up the
> > extant books (One will have to be DP-EU only; it's from 1930) I plan
> > to go back and produce some cleaner scans for my first few books and
> > possibly those from the other PMs (assuming I get permission; I don't
> > want to step on toes.)
> 
> I know there were some issues with some Potter illustrations coming
> later than 1923, but as long as we can clear the images they're fine
> to include.

The problem is that the entire book was not published until 1930;
apparently most of the images were created in 1906, but the work was
not completed and published for a long time. It was renewed (R206616),
so it is copyrighted here for a while, and in Life+70 until after
2013. However, it is clearable under Life+50 copyright law. (The Tale
of Little Pig Robinson.)
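
The Life+50 versus Life+70 difference above is simple arithmetic. A minimal
sketch, assuming the term runs through the end of the calendar year and
taking Beatrix Potter's death year as 1943 (a fact not stated in the
thread); it does not model the separate renewal-based U.S. term, and the
function name is only illustrative:

```python
def first_public_domain_year(death_year, term_years):
    """Return the first calendar year in which the work is out of copyright.

    Assumes a "life plus N years" rule whose term runs through
    31 December of the year death_year + term_years.
    """
    return death_year + term_years + 1


POTTER_DEATH_YEAR = 1943  # assumption supplied here, not taken from the thread

if __name__ == "__main__":
    for term in (50, 70):
        year = first_public_domain_year(POTTER_DEATH_YEAR, term)
        print(f"Life+{term}: public domain from {year}")
```

Under those assumptions Life+50 frees the book from 1994 and Life+70 from
2014, which matches "until after 2013" above.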

Aside from that, only The Story of the Fierce Bad Rabbit is unscanned,
and I shall correct that now that I can use the newer edition. The
other books are in various states of completion; most are waiting on
missing pages. I have spent a fair amount of time and effort to get
the images looking right; these are more visual works than written
ones.

R C
From joshua at hutchinson.net  Tue Jun 21 10:43:24 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Jun 21 10:43:31 2005
Subject: [gutvol-d] Baha'i Faith texts - Terms of Use acceptable for us?
Message-ID: <20050621174324.B57D09E9E5@ws6-2.us4.outblaze.com>

The Baha'i Faith makes quite a bit of material available in eBook form on its website.  Further, the Terms of Use (http://reference.bahai.org/en/terms.html) seem to make it perfectly acceptable to further distribute this work as long as the copyright and attribution are intact and it is for non-commercial use.

Does anyone see a problem with "raiding" their material for inclusion in PG?  I realize the texts will need to be reformatted to our standards and formats, but I can do that.

Plus, as a side-benefit, they have many of the texts in Persian and Arabic as well as English, so we would be getting multiple languages represented in one swoop.

Josh

PS FYI, I am a Baha'i, but I don't speak in any official capacity.  I'm going by the Terms of Use linked above.
From Bowerbird at aol.com  Tue Jun 21 12:16:21 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Jun 21 12:16:37 2005
Subject: [gutvol-d] header detection revisited
Message-ID: <1f9.c2d11b6.2fe9c185@aol.com>

it's been one month since my post about "detecting headers",
in response to jon noring's "challenge" in that specific regard.

in case you've forgotten, here's a quick recap:

     big and bold.  that's what headers look like.
     conspicuous.  real hard to miss.  easy to find.

as i said last month, i have developed a 30-item checklist.
that's how many ways a header can make itself conspicuous.
but the main way -- by far -- is simply to be big and/or bold.

so it's time now for part 2.

but first, any questions?  don't be shy, step right up, because
headers are the first step toward detecting all types of things.
(which is why we need to discuss them in some more detail.)

-bowerbird
From jon at noring.name  Tue Jun 21 13:47:50 2005
From: jon at noring.name (Jon Noring)
Date: Tue Jun 21 13:48:06 2005
Subject: [gutvol-d] header detection revisited
In-Reply-To: <1f9.c2d11b6.2fe9c185@aol.com>
References: <1f9.c2d11b6.2fe9c185@aol.com>
Message-ID: <1311771965.20050621144750@noring.name>

Bowerbird wrote:

> It's been one month since my post about "detecting headers",
> in response to jon noring's "challenge" in that specific regard.
>
> in case you've forgotten, here's a quick recap:
>
>      big and bold.  that's what headers look like.
>      conspicuous.  real hard to miss.  easy to find.
>
> as i said last month, i have developed a 30-item checklist.
> that's how many ways a header can make itself conspicuous.
> but the main way -- by far -- is simply to be big and/or bold.
>
> so it's time now for part 2.
>
> but first, any questions?  don't be shy, step right up, because
> headers are the first step toward detecting all types of things.
> (which is why we need to discuss them in some more detail.)

There's enough variation in how headers can be formatted in print, as
well as some other structures which look like headers but are not,
that it is not possible to auto-determine with 100% reliability that
something is a header. There are also language/country/time-era
differences which further confuse matters.

And even if one is able to correctly auto-determine that something is
a header, there are sometimes difficulties in autodetecting the header
level, which is usually important.

It is simply not yet possible to reliably auto-determine the structure
of books and documents. This is the big problem with PDF-to-whatever
converters, since (unstructured) PDF does not preserve structural
information -- it simply lays out the content according to visual
typesetting conventions (which, of course, vary by country, language,
time era, and the whims of the author/publisher.)

Now, if the goal is to try to auto-determine a document's structure
knowing that it won't always get it right, as part of a human proofing
process (e.g., Distributed Proofreaders), then that is another matter.
But it is hard to tell from Bowerbird's comments whether he
intends his methodology and tools to be part of a human proofing
process, or to replace it entirely. I think he will find more
acceptance of his methodology and tools by making clear the former.

Jon Noring

From Bowerbird at aol.com  Tue Jun 21 14:27:46 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Tue Jun 21 14:28:03 2005
Subject: [gutvol-d] header detection revisited
Message-ID: <42.6ba397c5.2fe9e052@aol.com>

as i step you through example after example after example -- 
all of 'em pre-existing, including many from the library itself,
and handled with solid honest-to-goodness working source-code
-- you'll come to realize fully how _easy_ it is to detect headers
(and even header-level!), and the people who insist on telling you
"it's impossible" will seem curiously illogical and out-of-touch...

oh yeah, feel free to recommend any e-text from the whole library
as a "test-case" that you would like me to consider in detail!      :+)
(to be fair, test-cases should include scans to resolve any doubts.)

thank you!  enjoy your first full day of summer!  90 degrees in l.a.!

-bowerbird
From lee at novomail.net  Wed Jun 22 15:42:12 2005
From: lee at novomail.net (Lee Passey)
Date: Wed Jun 22 15:42:31 2005
Subject: [gutvol-d] header detection revisited
In-Reply-To: <20050622190003.C66AD8C837@pglaf.org>
References: <20050622190003.C66AD8C837@pglaf.org>
Message-ID: <42B9E944.2020900@novomail.net>


>it's been one month since my post about "detecting headers",
>in response to jon noring's "challenge" in that specific regard.
>
>in case you've forgotten, here's a quick recap:
>
>     big and bold.  that's what headers look like.
>     conspicuous.  real hard to miss.  easy to find.
>
>as i said last month, i have developed a 30-item checklist.
>that's how many ways a header can make itself conspicuous.
>but the main way -- by far -- is simply to be big and/or bold.
>
>so it's time now for part 2.
>
>but first, any questions?  don't be shy, step right up, because
>headers are the first step toward detecting all types of things.
>(which is why we need to discuss them in some more detail.)
>
>-bowerbird
>  
>

The question is a bit ambiguous. What are you trying to detect headers 
_from_? AFAICT, Gutenberg e-texts don't have big and don't have bold, so 
neither can be the hallmark of a header in Gutentexts. Presumably, 
therefore, you are trying to detect headers in some marked-up text that 
uses some sort of presentational markup.

Given your assumption that headers are 1. conspicuous, 2. hard to miss, 
and 3. easy to find (all variations on a theme), it seems to me that the 
best way to detect a header is to determine the general characteristics 
of the majority of all paragraphs in a document (size, indentation, 
amount of punctuation, location of punctuation, capitalization, etc.) 
and identify as headers any "paragraphs" which fall way outside the mean.

I presume you have a reliable way to identify paragraphs (not always 
possible when using text derived from PDF files).

Consider the shortest verse of the Bible: "Jesus wept." Biblical verses 
are merely numbered paragraphs. Can your algorithm determine that it is 
a paragraph and not a header? This is the problem of the false positive: 
it is as important to identify not-headers as it is to identify headers.
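
The outlier idea above, and the false-positive worry, can be sketched in a
few lines of Python. This is only an illustration under stated assumptions:
paragraphs are separated by blank lines, "typical" is taken to be the median
word count rather than the mean, and the 0.25 ratio and the
terminal-punctuation test are arbitrary choices made here, not anything
proposed in the thread.

```python
import statistics


def split_paragraphs(text):
    """Split plain text into paragraphs on runs of blank lines."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def likely_headers(text, length_ratio=0.25):
    """Flag paragraphs that are much shorter than typical and unpunctuated.

    length_ratio is a hypothetical tuning knob: a paragraph counts as
    "short" when its word count is at most length_ratio * median.
    """
    paragraphs = split_paragraphs(text)
    if not paragraphs:
        return []
    typical = statistics.median(len(p.split()) for p in paragraphs)
    flagged = []
    for paragraph in paragraphs:
        is_short = len(paragraph.split()) <= typical * length_ratio
        ends_like_sentence = paragraph.endswith((".", "!", "?", '"'))
        if is_short and not ends_like_sentence:
            flagged.append(paragraph)
    return flagged
```

On length alone, "Jesus wept." would be flagged as a header; it escapes here
only because it ends with a full stop. A header that happens to end in
punctuation would be missed, so this trades one kind of error for another.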

You would be much more likely to increase your list of special cases if 
you would share the thirty-odd special cases you have already identified.

From nwolcott at dsdial.net  Thu Jun 23 10:32:23 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Thu Jun 23 10:33:40 2005
Subject: [gutvol-d] Volunteers in New Jersey area?
Message-ID: <001201c57819$8abec200$bd9495ce@gw98>

Are there any PG volunteers near Rutgers University in New Jersey? I need someone to scan a few pages of microfilm to disc or email; there is apparently no charge for this.


N Wolcott  nwolcott2@post.harvard.edu
From Bowerbird at aol.com  Thu Jun 23 14:58:09 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jun 23 14:58:27 2005
Subject: [gutvol-d] header detection revisited
Message-ID: <fd.16372b2e.2fec8a71@aol.com>

lee said:
>   The question is a bit ambiguous.

only if you haven't been following the drama for the last year-and-a-half.
welcome to this listserve, lee.  have you dropped the handle for good now?
it's been some time since we chatted, especially frontchannel...


>   What are you trying to detect headers _from_? 
>   AFAICT, Gutenberg e-texts don't have big and don't have bold, 
>   so neither can be the hallmark of a header in Gutentexts. 

that's right.  so for that i need to call on some of the other items
in my 30-item checklist.  the very best way to detect headers in
a p.g. e-text is to test for blank-lines above the line in question.

three blank lines will grab almost all of the headers, as well as
a dose of false-alarms.  the job then is to toss the false-alarms,
and to do the best job possible of discerning the missed headers.

and actually, in perhaps 25%-30% of project gutenberg's e-texts,
pulling lines that start with "chapter" will net most headers.    :+)
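
a minimal sketch of the two tests just described (blank lines above a line, and lines that start with "chapter") might look like the following. it is not the 30-item checklist; the function names and the `min_blank` parameter are assumptions made for illustration, and the false-alarm filtering mentioned above is left out entirely.

```python
def blank_lines_above(lines):
    """Yield (index, stripped text, number of blank lines directly above)
    for every non-blank line."""
    blanks = 0
    for index, line in enumerate(lines):
        if line.strip():
            yield index, line.strip(), blanks
            blanks = 0
        else:
            blanks += 1


def find_header_candidates(text, min_blank=3):
    """Collect lines preceded by at least `min_blank` blank lines,
    plus any line that starts with the word "chapter"."""
    found = []
    for index, line, blanks in blank_lines_above(text.splitlines()):
        if blanks >= min_blank or line.lower().startswith("chapter"):
            found.append((index, line))
    return found
```

note that the "chapter" test will also catch a wrapped prose line that happens to begin with that word, so the output is a list of candidates, not a final answer.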


>   Presumably, therefore, you are trying to detect headers in some  
>   marked-up text that uses some sort of presentational markup.

"markup" doesn't usually enter into the equation.  it can, of course,
but if something has been marked up, a good way to find the headers
is to examine the markup.  nonetheless, i _can_ use my system on
the _presentation_ of text that has been marked-up; many of my 
examples will be just that.  as such, it can be used in cases where
the mark-up is not available, for one reason (print) or another (.pdf),
but its presentation is.  of more direct concern to this listserve,
however, is its application toward the task that many people here do,
which for the most part is to digitize text from scans of paper-books.

a routine that recognizes headers in o.c.r. output -- because they are
relatively big and/or set in bold -- saves the digitizer from that chore.
i haven't discussed the importance revolving around header-recognition,
so that might not seem like a big deal.  but it is indeed rather important.
(any e-book programmer, like yourself, lee, knows why it's important.)

and, getting back again to the existing e-texts -- some 16,000+ now --
a routine for determining the headers in them would be quite valuable...

if you're looking for a general overview, i focus on 3 distinct arenas:
1.  strict z.m.l., where header-structure is defined by certain rules.
2.  "fuzzy" mode, where texts are somewhat consistent, but not always.
3.  "wild" texts, where all bets are off and you do the best that you can.

project gutenberg's e-texts generally fall in the second category.
as the examples i give will show, it would be relatively easy for me
to make software that inputs text from the second category and then
modifies it and outputs a file conforming to the strict first category.
but nobody from project gutenberg took me up on my offer to do that...

i've done enough work on arena #3 to know that it will be possible,
although you can't expect perfect output from the tool on a wild text.

i largely abandoned arena #2 when project gutenberg people passed,
although there will be wide-ranging applicability of this arena on
texts with some kind of regularity in them, such as listserve digests.

but my main focus now is on spreading the gospel of arena #1 -- z.m.l.
in z.m.l., headers are indicated simply by having blank lines above them.
(and the more blank lines, the higher the priority-level of the heading,
so it's a cinch to handle even the most complex of heading-structures.)

this simplicity means that it's easy to write fast code to find headers
in a z.m.l. file, and it's simple for users to understand how to make 'em.
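
as a rough illustration of how little code that takes, here is a sketch that ranks headings by the number of blank lines above them. the message does not give the exact z.m.l. rules, so the threshold of two blank lines (`min_blank`) and the way counts are mapped to levels are assumptions, as is the function name.

```python
def blank_line_headings(text, min_blank=2):
    """Return (level, title) pairs for lines preceded by at least
    `min_blank` blank lines. Level 1 is the highest priority."""
    blanks = 0
    found = []
    for line in text.splitlines():
        if not line.strip():
            blanks += 1
            continue
        if blanks >= min_blank:
            found.append((blanks, line.strip()))
        blanks = 0
    # More blank lines means higher priority, so the largest count
    # seen in the file becomes level 1, the next largest level 2, etc.
    distinct_counts = sorted({count for count, _ in found}, reverse=True)
    level_of = {count: level for level, count in enumerate(distinct_counts, start=1)}
    return [(level_of[count], title) for count, title in found]
```

ranking the counts relative to each other, instead of hard-coding "four blank lines means level 1", keeps the sketch independent of whatever the real rules specify.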

there is still a big explosion of self-publishing that will be happening,
and i want to spare all those new writers the pains of doing mark-up.
i'd much rather have them concentrating on their _content_ instead!

once i've got all the tools in place to do what i want with arena #1,
i'll return to arena #3.  being able to take text "from the wild" and
ascertain its underlying structure, and then output it in strict z.m.l.,
so it can be handled with my tools, will be an awesome achievement.

again, this is an arena where markup is impractical, perhaps impossible.
consider all the content that is being generated _every_single_day_ on
yahoogroups.  nobody's going to mark-up all that content, so we need to
have a way of pulling it into our e-books and have it be nicely formatted.


>   Given your assumption that headers are 
>   1. conspicuous, 
>   2. hard to miss, and 
>   3. easy to find 
>   (all variations on a theme)

thanks for noticing the theme...         ;+)

but it's not really an _assumption_.  (nice try to spin it that way, though.)
it's actually an _observation_ on the very _nature_ of _being_ a _header_,
one of those things that seems totally obvious once realized and verbalized.

and of course, once you have realized that headers are _hard-to-miss_,
it becomes very silly to maintain that it is "impossible" to detect them.
of course you can detect them -- because they stick out like sore thumbs!


>   it seems to me that the best way to detect a header is to 
>   determine the general characteristics of the majority of 
>   all paragraphs in a document (size, indentation, amount of 
>   punctuation, location of punctuation, capitalization, etc.)  and 
>   identify as headers any "paragraphs" which fall way outside the mean.

now you're thinking.

looks like you're on your way to replicating my 30-item checklist.


>   I presume you have a reliable way to identify paragraphs 
>   (not always possible when using text derived from PDF files).

well, yes.

and the fact that text copied out of a .pdf loses its blank lines --
which then makes paragraph-detection exceedingly more difficult,
-- does indeed make the detection of headers more difficult as well.

which means you have to solve the paragraph-detection problem first,
as best as you can, anyway, with text that you've copied out of a .pdf.

restoring the paragraphs is a much bigger task than detecting headers.
if you can't perform that hard task for end-users, why do the easy one?
but the solution isn't as hard as you might think, although it's not 100%.
when i'm done discussing headers, if you want to discuss this, we can...

and besides, dealing with text copied out of a .pdf is not a high priority.
the best way to deal with _that_ kind of text is to go to the producer
and say, "can i instead have the file that you used to produce the .pdf?"

but even without having solved this .pdf paragraph-detection problem,
-- i.e., with all blank-lines removed -- my checklist does pretty well...


>   Consider the shortest verse of the Bible: "Jesus wept." 
>   Biblical verses are merely numbered paragraphs. 
>   Can your algorithm determine that it is a paragraph and not a header? 

um yeah.  "headers" in the bible are "paragraphs" that are not numbered.
and -- as you yourself just pointed out -- the actual verses are.  voila.
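
that rule is easy to sketch. the example below assumes verses are numbered in a "chapter:verse" style such as "11:35" at the start of the paragraph; that format, the pattern `VERSE_NUMBER` and the function name are assumptions for illustration, and a different edition would need a different pattern.

```python
import re

# Assumed numbering style: "11:35 Jesus wept."
VERSE_NUMBER = re.compile(r"^\d+:\d+\s")


def unnumbered_paragraphs(paragraphs):
    """Return the paragraphs that do not start with a verse number;
    under the rule above, these are the header candidates."""
    return [p for p in paragraphs if not VERSE_NUMBER.match(p)]
```

so "11:35 Jesus wept." is kept out of the header list because of its number, however short it is.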


>   This is the problem of the false positive: 

>   it is as important to identify not-headers as it is to identify headers.

yes it is.  and much of the 30-item checklist is attuned to that issue.
once you've accepted that this is part of the job, it's not all that hard.


>   You would be much more likely to 
>   increase your list of special cases 
>   if you would share the thirty-odd 
>   special cases you have already identified.

i haven't identified "thirty-odd special cases".
i've abstracted 30 rules that act in combination
to answer the question at hand -- is this a header?

and it wasn't that hard.  you can probably come up with 10-15
right off the top of your head, without even thinking too much.

and if you subjected those to empirical testing on lots of e-texts,
as i have over the course of the last 2-3 years, you would probably
discover the rest of my 30 items.  and then you too would be saying,
"it's not impossible, folks, and in fact, it's not even all that difficult."
there's no magic here.  just hard work...

-bowerbird
From hart at pglaf.org  Sun Jun 26 09:34:16 2005
From: hart at pglaf.org (Michael Hart)
Date: Sun Jun 26 09:34:19 2005
Subject: [gutvol-d] Derivative works, or, what is copyrightable?
In-Reply-To: <15cfa2a505061914061d3cfda9@mail.gmail.com>
References: <15cfa2a505061914061d3cfda9@mail.gmail.com>
Message-ID: <Pine.LNX.4.60.0506260925390.10521@pglaf.org>


We recently discussed the non-copyrightability of new reproductions
of old works, as per the recent court case of:

Bridgeman Art Library v. Corel Corp

In which it was determined that any reproductions of public domain works
that were attempting to accurately reproduce the original works were not
copyrightable, and this should be applicable here, as far as I can tell.


I am not a lawyer. . .this is NOT a legal opinion or legal advice.

IANAL = I am not a lawyer.

However, I am sending this to two of our legal advisors for comment.

Meanwhile, I will append the previous message concerning


Bridgeman Art Library v. Corel Corp


below this message.


Michael


On Sun, 19 Jun 2005, Robert Cicconetti wrote:

> I've been going through my files trying to close out some partially
> finished projects. I have several Beatrix Potter books that had
> missing or damaged pages, and I went to the library today to try to
> fill in the blanks.
>
> Unfortunately, they only had the newer editions with a modern
> copyright, claiming a copyright because they had made a new transfer
> of the old watercolors. As far as I understand copyright law, this
> claim is bogus; a derivative work must be different enough from the
> original to be considered a new work; a slight technical improvement
> on the reproduction is not enough. An original lithograph, sure, but
> not making new screens. This may or may not be complicated by the fact
> that the publisher operates both out of the UK and the US.
>
> Is my understanding correct enough to go through with an official
> clearance request? Or shall I hunt for older copies? Potter books are
> not rare, but finding the older ones is more difficult.
>
> Thanks,
> R C


    To read the court decision, see Bridgeman Art Library v. Corel Corp,
    36 F. Supp. 2d 191 (S.D.N.Y. 1999)

    This article by the American Association of Museums states in blunt
    terms that they expect the Bridgeman decision to stand. In fact they
    never brought a lawsuit like this, and asked Bridgeman to drop their
    suit, because they knew the decision would go against them. I will
    spare you my opinion about claiming to own something you know belongs
    to the public domain.

Bridgeman Art Library v Corel Corp

    Many collage artists use reproductions of museum art in their work,
    assuming that a painting created hundreds of years ago must be in the
    public domain.

    To their chagrin, artists who try to publish such work have discovered
    that even if the original art is public domain, all existing
    reproductions are under copyright. This renders the original work
    completely out of reach, regardless of whether it is technically
    public domain.

    Museums prevent the viewing public from photographing art in their
    collections for many reasons, such as the expense and inconvenience of
    moving their art so it can be photographed. And more importantly, to
    preserve a monopoly over reproductions. Museums derive substantial
    income from posters, greeting cards, mouse pads etc. Naturally they
    want to protect their intellectual property.

    However, a recent court case may have shed new light on the situation.
    Bridgeman Art Library is a British company which licenses
    transparencies of museum art. In 1998, Bridgeman sued Corel, claiming
    that Corel's CD of fine art reproductions infringed on Bridgeman's
    copyright.

    The court determined that museum reproductions, whose purpose is to
    duplicate the original work as precisely as possible, do not involve
    enough originality to be copyrighted as a derivative work. In other
    words, a museum reproduction of fine art in the public domain is
    itself public domain, and unauthorized duplication of the reproduction
    is not copyright infringement.

    High-quality photography involves a great deal of skill and effort.
    That may make this decision seem unfair. After all, what is the point
    of going to all that work? A high quality reproduction has no more
    protection than an amateur snapshot. Probably less, since a snapshot
    will likely include elements (like an odd perspective or someone
    standing next to the artwork) that would qualify as originality.

    The court made a distinction between skill and originality. It may
    require an immense amount of skill to create a photograph that
    precisely duplicates a work of art. But, the court said, "'sweat of
    the brow' alone is not the 'creative spark' which is the sine qua non
    of originality." An exact duplicate deserves no more copyright
    protection than a photocopy.

    The decision noted that "There is little doubt that many photographs,
    probably the overwhelming majority, reflect at least the modest amount
    of originality required for copyright protection...." However,

    "Plaintiff by its own admission has labored to create "slavish copies"
    of public domain works of art. While it may be assumed that this
    required both skill and effort, there was no spark of originality --
    indeed, the point of the exercise was to reproduce the underlying
    works with absolute fidelity. Copyright is not available in these
    circumstances."

    Speaking about this case, an attorney for the American Association of
    Museums said: "Just about every museum attorney looking at the case
    objectively thinks it came out the correct way according to U.S.
    copyright law -- that's why no museum had ever brought such a suit....
    It would have been unwise for AAM to be on Bridgeman's side in this
    case because it would have undermined our credibility."

    Some important points to note:
      * Bridgeman v Corel affects only United States law. If you intend to
        publish your work in other countries besides the US, I would not
        recommend using this case as a guideline for legal use.
      * Bridgeman v Corel does not affect the law regarding photographs of
        three-dimensional works of art. The decision specifically
        addresses only two-dimensional works, where the goal is to
        duplicate the original as closely as possible. Photographing
        sculpture involves decisions about position, backdrop, lighting
        etc., all of which would probably make the photograph pass the
        "originality" test. However, this case does not discuss it one way
        or the other.
      * Bridgeman v Corel does not suggest that all museum reproductions
        are in the public domain. If the original is still under
        copyright, then so is the reproduction.
      * Bridgeman v Corel does not mean that you cannot be sued. Anyone
        can sue for any reason, whether or not they expect to win. (In
        fact, sometimes the threat of legal action is used as a bullying
        tactic, without any concern for who would win in court.) It does
        mean that you can copy museum reproductions of historical art in
        good faith.


    copyright © 2000, 2001 by Sarah Ovenall. All rights reserved.
From cannona at fireantproductions.com  Sun Jun 26 19:24:21 2005
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Jun 26 19:24:55 2005
Subject: [gutvol-d] PG Cookbook
Message-ID: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>

I was thinking a few days ago about how PG has attracted volunteers from 
all over the world from different backgrounds, and I got to thinking that 
it might be kind of fun if PG were to compile a cookbook containing the 
favorite recipes from our volunteers.  Since, in most cases, recipes can't 
be copyrighted, there shouldn't be any problem in that regard.  I'll bet we 
could get a pretty sizable and diverse collection if we put the word out on 
this list and at DP.

Anyway, it's just an idea.  Thoughts?  Any interest?

Sincerely
Aaron Cannon


--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) 


From hacker at gnu-designs.com  Sun Jun 26 19:30:14 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sun Jun 26 19:30:55 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
Message-ID: <Pine.LNX.4.62.0506262227580.14170@angst.gnu-designs.com>


> I was thinking a few days ago about how PG has attracted volunteers 
> from all over the world from different backgrounds, and I got to 
> thinking that it might be kind of fun if PG were to compile a 
> cookbook containing the favorite recipes from our volunteers.  
> Since, in most cases, recipes can't be copyrighted, there shouldn't 
> be any problem in that regard.  I'll bet we could get a pretty 
> sizable and diverse collection if we put the word out on this list 
> and at DP.

	I'd be more than happy to compile this into a mobile version 
using Plucker, to beam/share with anyone who cares to read and 
distribute it. I've done quite a few for many other projects, which 
you can see some screenshots and samples of here: 

	http://code.plkr.org/

	Just let me know when it's ready and I'll do the conversion to 
Plucker format (it's not usually a straight-up conversion; in most 
cases, it requires some reformatting of the contents, adding a TOC, 
and many other subtle things).


David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
From JBuck814366460 at aol.com  Sun Jun 26 20:36:07 2005
From: JBuck814366460 at aol.com (Jared Buck)
Date: Sun Jun 26 20:36:21 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
Message-ID: <42BF7427.4010007@aol.com>

Definitely some interest from me, Aaron :)  Got quite a few recipes I 
CAN share, including that for my world-famous(I hope) chocolate chip 
cookies! :)

Jared

Aaron Cannon wrote on 6/26/2005, 7:24 PM:

 > I was thinking a few days ago about how PG has attracted volunteers from
 > all over the world from different backgrounds, and I got to thinking that
 > it might be kind of fun if PG were to compile a cookbook containing the
 > favorite recipes from our volunteers.  Since, in most cases, recipes
 > can't
 > be copyrighted, there shouldn't be any problem in that regard.  I'll
 > bet we
 > could get a pretty sizable and diverse collection if we put the word
 > out on
 > this list and at DP.
 >
 > Anyway, it's just an idea.  Thoughts?  Any interest?
 >
 > Sincerely
 > Aaron Cannon
 >
 >
 >
 >
 > --
 > E-mail: cannona@fireantproductions.com
 > Skype: cannona
 > MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail
 > address.)
 >
 >
 > _______________________________________________
 > gutvol-d mailing list
 > gutvol-d@lists.pglaf.org
 > http://lists.pglaf.org/listinfo.cgi/gutvol-d
 >


From tb at baechler.net  Sun Jun 26 23:34:53 2005
From: tb at baechler.net (Tony Baechler)
Date: Sun Jun 26 23:33:07 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
Message-ID: <5.2.0.9.0.20050626233126.03fc5e40@bisinc.us>

Hello.  Well, while I don't specifically have any favorites, I have over 
163,000 recipes I would be willing to donate if that helps.  I have no idea 
of the copyright status of them though.

Also, could you please elaborate on why recipes can't be 
copyrighted?  Specifically, could you please tell me in which cases recipes 
can be protected by copyright?  I have thought for many years about making 
recipes, either individually or in cookbook form available in Braille or 
similar formats for the blind, but I was always worried about the legal 
issues.  The laws are very specific on how copyrighted works may be put 
into formats such as Braille and I have no money or means to defend myself 
in case of suits.  You may write off list if you would like.

From traverso at dm.unipi.it  Sun Jun 26 23:57:15 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Sun Jun 26 23:51:48 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <5.2.0.9.0.20050626233126.03fc5e40@bisinc.us> (message from Tony
	Baechler on Sun, 26 Jun 2005 23:34:53 -0700)
References: <5.2.0.9.0.20050626233126.03fc5e40@bisinc.us>
Message-ID: <200506270657.j5R6vFE13175@pico.dm.unipi.it>


IANAL, but with common sense I would say that:

1) a collection of recipes gets a copyright.

2) the exact wording of a recipe can get a copyright; but the recipe
   itself (as a description of a procedure) does not have a copyright.

3) the recipe itself (i.e. the final product) can be patented or
   trademarked. 

Carlo

From cannona at fireantproductions.com  Mon Jun 27 06:17:55 2005
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Mon Jun 27 06:22:57 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <5.2.0.9.0.20050626233126.03fc5e40@bisinc.us>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	<5.2.0.9.0.20050626233126.03fc5e40@bisinc.us>
Message-ID: <6.2.1.2.0.20050627081606.03fe0b78@mail.fireantproductions.com>

At 01:34 AM 6/27/2005, you wrote:
>Hello.  Well, while I don't specifically have any favorites, I have over 
>163,000 recipes I would be willing to donate if that helps.  I have no 
>idea of the copyright status of them though.

I think, for this compilation, we're aiming for quality, rather than 
quantity.  But if you have a few particular favorites...


>Also, could you please elaborate on why recipes can't be 
>copyrighted?  Specifically, could you please tell me in which cases 
>recipes can be protected by copyright?  I have thought for many years 
>about making recipes, either individually or in cookbook form available in 
>Braille or similar formats for the blind, but I was always worried about 
>the legal issues.  The laws are very specific on how copyrighted works may 
>be put into formats such as Braille and I have no money or means to defend 
>myself in case of suits.  You may write off list if you would like.

The relevant web site for the US is here:
http://www.copyright.gov/fls/fl122.html

Sincerely
Aaron Cannon




--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) 


From j.hagerson at comcast.net  Mon Jun 27 06:34:13 2005
From: j.hagerson at comcast.net (John Hagerson)
Date: Mon Jun 27 06:34:20 2005
Subject: [gutvol-d] Amazon offers 1082 volume Penguin Classics for $7,989
Message-ID: <002001c57b1c$e8508920$0200a8c0@sarek>

http://slashdot.org/article.pl?sid=05/06/27/0632258&from=rss


From jon at noring.name  Mon Jun 27 09:15:02 2005
From: jon at noring.name (Jon Noring)
Date: Mon Jun 27 09:14:52 2005
Subject: [gutvol-d] Amazon offering of the complete "Penguin Classics
	Library"
Message-ID: <1382534655.20050627101502@noring.name>

Refer to:

http://online.wsj.com/public/article/0,,SB111921715006463546-S0zI_EVookezthz8VC7m_WXjOAo_20060627,00.html?mod=blogs

Fair Use snippet from above article:

"We get a lot fewer random Amazon.com links sent to us since the great
Henry Raddick stopped writing book reviews, something we're still
mourning. But this one was jaw-dropping: The Penguin Classics Library
Complete Collection, consisting of 1,082 books. List price: $13,317.74.
Discount price: $7,989.99. Never has a 40% discount seemed quite so
weighty."

I think the interest to PG and DP is obvious. :^)

Jon Noring

From collin at xs4all.nl  Mon Jun 27 14:03:44 2005
From: collin at xs4all.nl (Branko Collin)
Date: Mon Jun 27 13:49:44 2005
Subject: [gutvol-d] Amazon offers 1082 volume Penguin Classics for $7,989
In-Reply-To: <002001c57b1c$e8508920$0200a8c0@sarek>
Message-ID: <42C085D0.26479.29B00D4@localhost>

On 27 Jun 2005, at 8:34, John Hagerson wrote:

> http://slashdot.org/article.pl?sid=05/06/27/0632258&from=rss

I noticed the server was a little slow this afternoon (CET). :-)

-- 
branko collin
collin@xs4all.nl
From collin at xs4all.nl  Mon Jun 27 14:20:55 2005
From: collin at xs4all.nl (Branko Collin)
Date: Mon Jun 27 14:06:54 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <Pine.LNX.4.62.0506262227580.14170@angst.gnu-designs.com>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
Message-ID: <42C089D7.14465.2AABCF4@localhost>


??? wrote:
> > I was thinking a few days ago about how PG has attracted volunteers
> > from all over the world from different backgrounds, and I got to
> > thinking that it might be kind of fun if PG were to compile a
> > cookbook containing the favorite recipes from our volunteers. 
> > Since, in most cases, recipes can't be copyrighted, there shouldn't
> > be any problem in that regard.  I'll bet we could get a pretty
> > sizable and diverse collection if we put the word out on this list
> > and at DP.

Sounds like a fun idea.

However, I thought PG policy was to not publish previously 
unpublished works? Do I remember that correctly, and, if so, how 
would that influence this project?

Perhaps PG needs to have a sister project with almost exactly the 
same goals, except that it will publish Vanity Press.

Or were you talking about volunteers taking their recipes from the 
cookbooks in PG?

-- 
branko collin
collin@xs4all.nl
From marcello at perathoner.de  Mon Jun 27 15:06:39 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Jun 27 15:06:50 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <42C089D7.14465.2AABCF4@localhost>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	<42C089D7.14465.2AABCF4@localhost>
Message-ID: <42C0786F.6000105@perathoner.de>

Branko Collin wrote:

>>>I was thinking a few days ago about how PG has attracted volunteers
>>>from all over the world from different backgrounds, and I got to
>>>thinking that it might be kind of fun if PG were to compile a
>>>cookbook containing the favorite recipes from our volunteers. 

> However, I thought PG policy was to not publish previously 
> unpublished works? Do I remember that correctly, and, if so, how 
> would that influence this project?
> 
> Perhaps PG needs to have a sister project with almost exactly the 
> same goals, except that it will publish Vanity Press.

This has already been done:

   http://en.wikibooks.org/wiki/Cookbook


Of course, I could come up with better Italian recipes than they :-)


-- 
Marcello Perathoner
webmaster@gutenberg.org

From cannona at fireantproductions.com  Mon Jun 27 16:11:58 2005
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Mon Jun 27 16:12:53 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <42C0786F.6000105@perathoner.de>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	<42C089D7.14465.2AABCF4@localhost> <42C0786F.6000105@perathoner.de>
Message-ID: <6.2.1.2.0.20050627180746.041defb0@mail.fireantproductions.com>

The main idea of the project would be to recognize the diverse variety of 
volunteers, and not just put together a collection of recipes.  Still, the 
point about the vanity publishing is a good one.

Sincerely
Aaron Cannon


At 05:06 PM 6/27/2005, you wrote:
>Branko Collin wrote:
>
>>>>I was thinking a few days ago about how PG has attracted volunteers
>>>>from all over the world from different backgrounds, and I got to
>>>>thinking that it might be kind of fun if PG were to compile a
>>>>cookbook containing the favorite recipes from our volunteers.
>
>>However, I thought PG policy was to not publish previously unpublished 
>>works? Do I remember that correctly, and, if so, how would that influence 
>>this project?
>>Perhaps PG needs to have a sister project with almost exactly the same 
>>goals, except that it will publish Vanity Press.
>
>This has already been done:
>
>   http://en.wikibooks.org/wiki/Cookbook
>
>
>Of course, I could come up with better Italian recipes than they :-)
>
>
>--
>Marcello Perathoner
>webmaster@gutenberg.org
>


--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) 


From brad at chenla.org  Tue Jun 28 20:26:16 2005
From: brad at chenla.org (Brad Collins)
Date: Tue Jun 28 20:26:48 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	(Aaron Cannon's message of "Sun, 26 Jun 2005 21:24:21 -0500")
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
Message-ID: <7jgddesn.fsf@chenla.org>

Aaron Cannon <cannona@fireantproductions.com> writes:

> I was thinking a few days ago about how PG has attracted volunteers
> from all over the world from different backgrounds, and I got to
> thinking that it might be kind of fun if PG were to compile a cookbook
> containing the favorite recipes from our volunteers.  Since, in most
> cases, recipes can't be copyrighted, there shouldn't be any problem in
> that regard.  I'll bet we could get a pretty sizable and diverse
> collection if we put the word out on this list and at DP.
>
> Anyway, it's just an idea.  Thoughts?  Any interest?

This reminds me of a story.

Back in the 90's a friend of mine was doing an environmental study for
China Light & Power or the Hong Kong Gov.  They were trying to put
together an inventory of species of fish in Hong Kong waters.  Finding
the Latin and English names for the fish was easy, but they were
supposed to do everything in both English and Chinese so they went
around the office asking people for the Chinese names for the
different types of fish.

No one could seem to remember the names for any of the fish but all of
them could think of a recipe for each fish.....  It turned out that
there was no agreement on names and that each little fishing village
and southern dialect had their own names for each type of fish.

In the end they gave up and proposed making a cookbook of the recipes
everyone had offered.  I never heard if anything came of the
cookbook....

b/

-- 
Brad Collins <brad@chenla.org>, Bangkok, Thailand
From holden.mcgroin at dsl.pipex.com  Tue Jun 28 23:16:28 2005
From: holden.mcgroin at dsl.pipex.com (Holden McGroin)
Date: Tue Jun 28 23:16:13 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <7jgddesn.fsf@chenla.org>
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	<7jgddesn.fsf@chenla.org>
Message-ID: <42C23CBC.5020405@dsl.pipex.com>

Brad Collins wrote:
> Back in the 90's a friend of mine was doing an environmental study for
> China Light & Power or the Hong Kong Gov.  They were trying to put
> together an inventory of species of fish in Hong Kong waters.  Finding
> the Latin and English names for the fish was easy, but they were
> supposed to do everything in both English and Chinese so they went
> around the office asking people for the Chinese names for the
> different types of fish.
> 
> No one could seem to remember the names for any of the fish but all of
> them could think of a recipe for each fish.....  It turned out that
> there was no agreement on names and that each little fishing village
> and southern dialect had their own names for each type of fish.
> 
> In the end they gave up and proposed making a cookbook of the recipes
> everyone had offered.  I never heard if anything came of the
> cookbook....

Which reminds me of one of my favourite quotes by HRH the Duke of Edinburgh:

"If it has four legs and is not a chair, has wings and is not an aeroplane, or swims and 
is not a submarine, the Cantonese will eat it."

Cheers,
Holden
From Bowerbird at aol.com  Wed Jun 29 10:37:20 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Jun 29 10:37:38 2005
Subject: [gutvol-d] Greetings ebook makers ;)
Message-ID: <d8.28d1d229.2ff43650@aol.com>

jeffrey said:
>   Hello fellow ebook creators,

hello jeffrey.  did anyone respond?

-bowerbird
From jefferydouglaswaddell at gmail.com  Wed Jun 29 11:38:38 2005
From: jefferydouglaswaddell at gmail.com (Jeff Waddell)
Date: Wed Jun 29 11:38:48 2005
Subject: [gutvol-d] Greetings ebook makers ;)
In-Reply-To: <d8.28d1d229.2ff43650@aol.com>
References: <d8.28d1d229.2ff43650@aol.com>
Message-ID: <8a44f71c05062911383891d02c@mail.gmail.com>

I have had some positive response from other forums, venues, and 
individuals. None of which has led to anything resembling a "job". This
would be the first sign of a response from the gutenberg community. Do you 
have any comments or suggestions?
 Jeff

 On 6/29/05, Bowerbird@aol.com <Bowerbird@aol.com> wrote: 
> 
> jeffrey said:
> > Hello fellow ebook creators,
> 
> hello jeffrey. did anyone respond?
> 
> -bowerbird
>
From Bowerbird at aol.com  Wed Jun 29 12:09:59 2005
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Jun 29 12:10:14 2005
Subject: [gutvol-d] Greetings ebook makers ;)
Message-ID: <1e.485ac604.2ff44c07@aol.com>

jeffery said:
>   This would be the first sign of a response 
>   from the gutenberg community. 

well, "the gutenberg community" doesn't consider me to
be a part of it, so i guess you are still waiting for them.         :+)

in fact, since you've now soiled your trousers by
even speaking to me, they will probably tell you
that they are ignoring you for _that_ reason.


>   Do you have any comments or suggestions?

motivate yourself, because they won't give you any help.          :+)

-bowerbird

p.s.  sorry i spelled your name wrong before...
From Gutenberg9443 at aol.com  Wed Jun 29 15:03:36 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Wed Jun 29 15:03:53 2005
Subject: [gutvol-d] PG Cookbook
Message-ID: <199.422dd536.2ff474b8@aol.com>

Announcement:
 
I am not going to edit that cookbook. If I ever edit a cookbook,
it will be one I wrote.
 
However, for anybody who actually makes a decision to
make a cookbook, here is my recipe. I call it
 
I-am-worn-out-and-it-is-hot-as-h***-high-protein high-fiber salad:
 
Chill one can of pork'n'beans.
Chill one can of whole-kernel corn.
Drain corn and empty into large bowl.
Add undrained pork'n'beans.
Chop up however many tomatoes and fresh onions you want to put in it.
Add black olives and/or green stuffed olives and/or whatever else you want.
Add celery or lettuce or whatever else you want.
Toss in mayonnaise or Russian dressing or Italian dressing or whatever else
you want.
Eat with corn chips or potato chips or no chips at all.
 
I am the only person I know who eats this. Everybody gives me that "are you
out of your mind" look if I tell them about it. But the combination of beans
and corn creates complete protein. Whatever veggies you decide to put in it
are, of course, veggies. So it's a reasonably high-nutrition meal. You might
drink milk with it or add diced cheese, as it is low in calcium. Or you might
get your calcium by eating Tums after it, if your digestion isn't as fond of
fiber as mine is.
 
Anne

Do you like to breathe?
Then save the trees!
Begin a personal relationship
with an ebook
TODAY!
From brad at chenla.org  Wed Jun 29 17:54:42 2005
From: brad at chenla.org (Brad Collins)
Date: Wed Jun 29 17:55:13 2005
Subject: [gutvol-d] PG Cookbook
In-Reply-To: <42C23CBC.5020405@dsl.pipex.com> (Holden McGroin's message of
	"Wed, 29 Jun 2005 07:16:28 +0100")
References: <6.2.1.2.0.20050626211707.01cb3d68@mail.fireantproductions.com>
	<7jgddesn.fsf@chenla.org> <42C23CBC.5020405@dsl.pipex.com>
Message-ID: <3br0d5pp.fsf@chenla.org>

Holden McGroin <holden.mcgroin@dsl.pipex.com> writes:

> Brad Collins wrote:

> Which reminds me of one of my favourite quotes by HRH the Duke of Edinburgh:
>
> "If it has four legs and is not a chair, has wings and is not an
> aeroplane, or swims and is not a submarine, the Cantonese will eat
> it."
>

Interesting.  When I was working in Beijing I heard the entire quote a
number of times in Mandarin.  In Hong Kong it's usually shortened to
something like "the Chinese will eat anything with four legs except a
chair", with some pride, I might add.

I wonder if the quote is originally Chinese and perhaps heard by the
Duke from someone like Governor Wilson (who was known to have somewhat
of a clue about local culture) as opposed to Chris Patten who was
only made gov to piss off the mainland.

Sorry, I know this is getting even further OT.....

b/

-- 
Brad Collins <brad@chenla.org>, Bangkok, Thailand