From kouhia at nic.funet.fi Thu Sep 2 05:09:12 2004
From: kouhia at nic.funet.fi (Juhana Sadeharju)
Date: Thu Sep 2 05:09:18 2004
Subject: [gutvol-d] Re: unauthorized PG venders
Message-ID:

>On Wed, 25 Aug 2004, Greg Newby wrote:
>
>> On Wed, Aug 25, 2004 at 02:34:06PM -0500, Aaron Cannon wrote:
>>> Just found the following link on google. Is this permitted? I was under
>>> the impression that the DVD was not supposed to be sold.
>>>
>>> http://www.baccarat-instructions.com/items/6913444126.html
>>
>> Resale for the DVD (unlike the CD) is not explicitly
>> prohibited.
>>
>> He needs to pay trademark royalties, however, and to
>> my knowledge has not done so.
>>
>> Sometimes Michael likes to go after such trademark infringers.

So, what is this CD and DVD thing? I have never run into such issues
with GNU software. What should one do when releasing PG etexts on CDs or
DVDs? Pay royalties? How much? Why is permission to use the PG trademark
not free of charge? Are people being greedy here?

Would it be enough to remove every reference to Project Gutenberg?

Yet again: Are the PG etexts free (in the GNU-like sense) or public
domain? Who holds the copyrights to the etexts in the PG archives?

I have a solution: Let's move all etexts to my project, Truly Free
Etexts. Then everyone can do anything with them, burn them to CDs and
DVDs, and sell and re-sell them. The etexts would last forever and nobody
could take the joy away -- this is what happens with GNU software.

Of course, people should think twice about where they contribute etexts.
Apparently PG has not been the best place in terms of freedom.
(Websites have re-copyrighted the PG etexts, and PG persons have
started their own business, PG2, with others' contributions.)

I would like to remind everyone that it is a great gift that old texts
go into the public domain. Let's not abuse this gift. Keep them in the
public domain and spread the good word.
Juhana
--
http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
for developers of open source graphics software

From joshua at hutchinson.net Thu Sep 2 05:40:11 2004
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Sep 2 05:40:26 2004
Subject: [gutvol-d] Re: unauthorized PG venders
Message-ID: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com>

Try understanding how things work before accusing people of "stealing"
and being "greedy."

PG texts are completely free. No rights reserved, no nothing.

The PG TRADEMARK ("Project Gutenberg") is not free. If you create a
CD with PG's trademark all over it, you are required to pay licensing
fees for that trademark. If, however, you strip the PG trademark, you
can do anything you want with those texts.

The reason is basically two-fold.

1) PG does need a revenue stream to maintain its admittedly frugal
operations. (In reality, the licensing accounts for almost nothing in
revenue. Hence, the greedy quote is particularly laughable.)

2) (And more important, imo) PG has to defend its trademark and good
name. If you are putting together a DVD of texts, but somehow do a
flat-out terrible job (i.e., half the files on the DVD are corrupted),
and PG's trademark is all over the place, we look bad. PG itself is
getting tarnished by actions outside our control. Putting licensing
over the trademark in place gives us *some* control over the content
that bears our name.

Josh

_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d

From brandon at corruptedtruth.com Thu Sep 2 07:36:23 2004
From: brandon at corruptedtruth.com (Brandon Galbraith)
Date: Thu Sep 2 07:36:30 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com>
References: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com>
Message-ID: <41372FE7.6070102@corruptedtruth.com>

Josh,

Thank you for the excellent points you made below.
I would have to say that it's mighty greedy for people to rip the books
from PG, burn them to CD and DVD, and then go selling them on eBay,
their website, etc. It's just plain rude for people to walk all over the
hard work Project Gutenberg does and then complain when it asks for a
small token in return.

Brandon

Joshua Hutchinson wrote:

>Try understanding how things work before accusing people of "stealing"
>and being "greedy."
>
>PG texts are completely free. No rights reserved, no nothing.
>
>The PG TRADEMARK ("Project Gutenberg") is not free. If you create a
>CD with PG's trademark all over it, you are required to pay licensing
>fees for that trademark. If, however, you strip the PG trademark, you
>can do anything you want with those texts.
>
>The reason is basically two-fold.
>
>1) PG does need a revenue stream to maintain its admittedly frugal
>operations. (In reality, the licensing accounts for almost nothing in
>revenue. Hence, the greedy quote is particularly laughable.)
>
>2) (And more important, imo) PG has to defend its trademark and good
>name. If you are putting together a DVD of texts, but somehow do a
>flat-out terrible job (i.e., half the files on the DVD are corrupted),
>and PG's trademark is all over the place, we look bad. PG itself is
>getting tarnished by actions outside our control. Putting licensing
>over the trademark in place gives us *some* control over the content
>that bears our name.
>
>Josh

From hacker at gnu-designs.com Thu Sep 2 07:41:59 2004
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Sep 2 07:42:29 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To: <41372FE7.6070102@corruptedtruth.com>
References: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com> <41372FE7.6070102@corruptedtruth.com>
Message-ID:

> Thank you for the excellent points you made below. I would have to say
> that it's mighty greedy for people to rip the books from PG, burn them
> to CD and DVD, and then go selling them on eBay, their website, etc.
> It's just plain rude for people to walk all over the hard work Project
> Gutenberg does and then complain when it asks for a small token in
> return.

Along these lines, would it be sufficient to take a CD/DVD of PG, roll
through each of the works in an automated fashion, convert them to a
format that PG does not currently support, and sell THAT for a fee?
(Where the fee covers the time to convert to the other format(s) as well
as distribution of the media itself, etc.)

I've been asked several times by my user community to provide something
very similar, but I would also like to make it VERY OBVIOUS that the
works are from the PG project.

As a long-time Free Software author and community contributor, I'm very
well aware of licensing, copyright, trademarks and their common misuse.

d.

From hart at pglaf.org Thu Sep 2 08:39:59 2004
From: hart at pglaf.org (Michael Hart)
Date: Thu Sep 2 08:40:01 2004
Subject: [gutvol-d] Re: unauthorized PG venders (fwd)
In-Reply-To:
References:
Message-ID:

> From: Joshua Hutchinson
> Try understanding how things work before accusing people of "stealing"
> and being "greedy."
>
> PG texts are completely free. No rights reserved, no nothing.

Actually, several hundred of the Project Gutenberg eBooks are
copyrighted, and thus cannot be legally resold without receiving
permission from the authors or copyright holders.

> The PG TRADEMARK ("Project Gutenberg") is not free.
> If you create a CD with PG's trademark all over it, you are required
> to pay licensing fees for that trademark. If, however, you strip the
> PG trademark, you can do anything you want with those texts.

Trademark law basically forbids "trading on the good name"
of the trademark holder. Thus it is illegal to resell PG
eBooks without permission if you use the Project Gutenberg
name to do so. Project Gutenberg is a registered trademark.

> The reason is basically two-fold.
>
> 1) PG does need a revenue stream to maintain its admittedly frugal
> operations. (In reality, the licensing accounts for almost nothing in
> revenue. Hence, the greedy quote is particularly laughable.)
>
> 2) (And more important, imo) PG has to defend its trademark and good
> name. If you are putting together a DVD of texts, but somehow do a
> flat-out terrible job (i.e., half the files on the DVD are corrupted),
> and PG's trademark is all over the place, we look bad. PG itself is
> getting tarnished by actions outside our control. Putting licensing
> over the trademark in place gives us *some* control over the content
> that bears our name.

Yes, it is also illegal to diminish the trademark's value in this manner.

I sent off an inquiry about this but, as usual, never received a reply.

Legally I have to send off such a message to defend the trademark; if
you don't, you end up losing the trademark, as with aspirin, ping pong,
etc.

Aspirin was Bayer's trademark, Ping Pong was from Westinghouse, as I
recall.

mh

From gbnewby at pglaf.org Thu Sep 2 11:33:47 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Sep 2 11:33:48 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To:
References: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com> <41372FE7.6070102@corruptedtruth.com>
Message-ID: <20040902183347.GA7791@pglaf.org>

On Thu, Sep 02, 2004 at 10:41:59AM -0400, David A. Desrosiers wrote:
>
> >Thank you for the excellent points you made below. I would have to say
> >that it's mighty greedy for people to rip the books from PG, burn them
> >to CD and DVD, and then go selling them on eBay, their website, etc.
> >It's just plain rude for people to walk all over the hard work Project
> >Gutenberg does and then complain when it asks for a small token in
> >return.
>
> Along these lines, would it be sufficient to take a CD/DVD of PG,
> roll through each of the works in an automated fashion, convert them to
> a format that PG does not currently support, and sell THAT for a fee?
> (Where the fee covers the time to convert to the other format(s) as
> well as distribution of the media itself, etc.)

Formatting (or reformatting) is not the issue; it's use of the
trademarked name.

If someone creates, say, a PDF format eBook, the question is whether the
PG trademark is used or not. The reformatting doesn't matter (except
that it might change from the 15% to the 20% royalty level in the small
print).

BTW, the small print essentially says that anything not covered (such as
CD or DVD collections) needs to be negotiated separately.
  -- Greg

From nihil_obstat at mindspring.com Thu Sep 2 11:33:46 2004
From: nihil_obstat at mindspring.com (Dennis McCarthy)
Date: Thu Sep 2 11:33:55 2004
Subject: [gutvol-d] Re: unauthorized PG venders (fwd)
Message-ID: <1796100.1094150026849.JavaMail.root@wamui02.slb.atl.earthlink.net>

(This is a bit off subject...)

"Ping-Pong" the game is still trademarked by Parker Brothers. It dates
back to the early 1930s. That is why all generic sporting associations
have the term "table tennis" in their names rather than ping-pong. Other
"ping pong" trademarks are apparently out there, but not for games or
sports, as that would confuse the consumer; the products are too
dissimilar. There used to be ping pong ice cream. I guess that there
could be a legal trademark for "Project Gutenberg" ice cream or maybe a
board game, but not for a literature storage and distribution concern.

"Aspirin" was lost by Bayer. Someone told me that this is why today all
generic names for medicines are cumbersome: that way, when the patent
expires, competitors have to sell a product with an ugly name. The
original patent holder still has the "nice" sounding trademark.

-----Original Message-----
From: Michael Hart
Sent: Sep 2, 2004 11:39 AM
To: The gutvol-d Mailing List
Subject: Re: [gutvol-d] Re: unauthorized PG venders (fwd)

Legally I have to send off such a message to defend the trademark; if
you don't, you end up losing the trademark, as with aspirin, ping pong,
etc.

Aspirin was Bayer's trademark, Ping Pong was from Westinghouse, as I
recall.

mh

From hacker at gnu-designs.com Thu Sep 2 11:47:32 2004
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Sep 2 11:48:30 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To: <20040902183347.GA7791@pglaf.org>
References: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com> <41372FE7.6070102@corruptedtruth.com> <20040902183347.GA7791@pglaf.org>
Message-ID:

> Formatting (or reformatting) is not the issue, it's use of the
> trademarked name.

Agreed, but reformatting would likely include the trademark.

> If someone creates, say, a PDF format eBook, the question is whether the
> PG trademark is used or not. The reformatting doesn't matter (except
> that it might change from the 15% to the 20% royalty level in the small
> print).

OK, so our only real recourse gives us the right to convert the works to
another format (we're talking about handheld redistribution here) but
not to provide any links to PG, or use PG in any of the names or
attribution? Wouldn't that be a copyright _and_ trademark violation,
because we'd be forced to "remove" all references to PG from the actual
works themselves? Obviously, I don't want to do this, because I respect
and support the project.

Doesn't that make it impossible, since our entire intention would be to
include PG in each of the texts, to draw more readers of the works and
generate some interest in the project?

Ideally, and I can expand on this if anyone is specifically interested
in the goals and design, we would be splitting the books up to include
pages, chapters, TOC, attribution, and hopefully a PG logo emblazoned
across each work.
I've got a schema and some code designed to suck the entire PG tree into
the db, which we can then tokenize and output in any format we want.

It isn't polished or "pretty" yet, we're still in the preliminary
stages, but if we can't redistribute the converted works, giving credit
to PG, the project is dead, full stop.

Let me know, because this puts a big kink in the works.

d.

From gbnewby at pglaf.org Thu Sep 2 11:57:07 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Sep 2 11:57:09 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To:
References: <20040902124011.883E72F95C@ws6-3.us4.outblaze.com> <41372FE7.6070102@corruptedtruth.com> <20040902183347.GA7791@pglaf.org>
Message-ID: <20040902185707.GA8976@pglaf.org>

On Thu, Sep 02, 2004 at 02:47:32PM -0400, David A. Desrosiers wrote:
>
> >Formatting (or reformatting) is not the issue, it's use of the
> >trademarked name.
>
> Agreed, but reformatting would likely include the trademark.
>
> >If someone creates, say, a PDF format eBook, the question is whether the
> >PG trademark is used or not. The reformatting doesn't matter (except
> >that it might change from the 15% to the 20% royalty level in the small
> >print).
>
> OK, so our only real recourse gives us the right to convert the
> works to another format (we're talking about handheld redistribution
> here) but not to provide any links to PG, or use PG in any of the names
> or attribution? Wouldn't that be a copyright _and_ trademark violation,
> because we'd be forced to "remove" all references to PG from the actual
> works themselves? Obviously, I don't want to do this, because I respect
> and support the project.

?? who said you couldn't convert to another format ??

It's explicitly permitted. See http://gutenberg.net/license
(the small print howto)

  -- Greg

From joshua at hutchinson.net Thu Sep 2 12:33:04 2004
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Sep 2 12:33:17 2004
Subject: [gutvol-d] Re: unauthorized PG venders
Message-ID: <20040902193304.8B1FEEDE77@ws6-1.us4.outblaze.com>

You can do all that just fine as long as you aren't charging for the
redistribution. The money only comes into play if you are charging
something for the redistributed works. Basically, if you're charging for
it, we want our fair cut. If you're handing them out for free, no
problem.

See you on the flip side, dude! :)

JHutch

From hacker at gnu-designs.com Thu Sep 2 12:41:56 2004
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Sep 2 12:42:30 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To: <20040902193304.8B1FEEDE77@ws6-1.us4.outblaze.com>
References: <20040902193304.8B1FEEDE77@ws6-1.us4.outblaze.com>
Message-ID:

> You can do all that just fine as long as you aren't charging for the
> redistribution. The money only comes into play if you are charging
> something for the redistributed works.
> Basically, if you're charging for it, we want our fair cut. If you're
> handing them out for free, no problem. See you on the flip side,
> dude! :)

Right, and that's the catch.

We'd be handing the works out for free in person at LUGs and such, but
for those people who want _our version_ of the converted works, which
involves quite a bit of hand-editing as well as several passes of
post-processing to convert them to our (open and documented) portable
format, shipped to them on hard media (CD or DVD), we would
(potentially) charge for the media and for the time/effort expended to
convert them all.

We're not actually charging for the works themselves.

I realize it seems grey, but it is distinctly different.

d.

From joshua at hutchinson.net Thu Sep 2 12:53:06 2004
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Sep 2 12:53:18 2004
Subject: [gutvol-d] Re: unauthorized PG venders
Message-ID: <20040902195306.82C6B2FA14@ws6-3.us4.outblaze.com>

It comes down to two options for you, then.

1) Remove all mention of PG.

2) Send PG a check for 15% of whatever you are charging for your
proprietary format.

Use of the PG trademark will cost you money in a commercial endeavour
(which this is, if money changes hands, no matter what spin you try to
put on it).

Josh

From gbnewby at pglaf.org Thu Sep 2 13:02:04 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Sep 2 13:02:05 2004
Subject: [gutvol-d] Re: unauthorized PG venders
In-Reply-To:
References: <20040902193304.8B1FEEDE77@ws6-1.us4.outblaze.com>
Message-ID: <20040902200204.GA10809@pglaf.org>

On Thu, Sep 02, 2004 at 03:41:56PM -0400, David A. Desrosiers wrote:
>
> >You can do all that just fine as long as you aren't charging for the
> >redistribution. The money only comes into play if you are charging
> >something for the redistributed works. Basically, if you're charging
> >for it, we want our fair cut. If you're handing them out for free, no
> >problem. See you on the flip side, dude! :)
>
> Right, and that's the catch.
>
> We'd be handing the works out for free in person at LUGs and such,
> but for those people who want _our version_ of the converted works,
> which involves quite a bit of hand-editing as well as several passes of
> post-processing to convert them to our (open and documented) portable
> format, shipped to them on hard media (CD or DVD), we would
> (potentially) charge for the media and for the time/effort expended to
> convert them all.
>
> We're not actually charging for the works themselves.
>
> I realize it seems grey, but it is distinctly different.

If you're doing reformatting and/or error correction, it would be nice
to submit these changes back (email errata@pglaf.org). If it's just
format conversion, it's probably not necessary.

If you create new CD or DVD images, we would consider redistributing
them. Our existing CD and DVD images have licenses (the CD is under a
Creative Commons license, including for a compilation copyright; the DVD
is just under the regular PG Small Print).

As to whether royalties are due: this is something that can be directed
to Michael Hart and me for decision. Typically we do not expect
royalties if the fee only covers the costs of reproduction &
distribution.

Generally, we like to see all types of redistribution. What we don't
like to see is trading on the Project Gutenberg name without our
permission or without paying royalties as required.

I hope this helps... once you have your plan worked out, please do get
in touch.

  -- Greg

Dr. Gregory B. Newby
Chief Executive and Director
Project Gutenberg Literary Archive Foundation
http://gutenberg.net
A 501(c)(3) not-for-profit organization with EIN 64-6221541
gbnewby@pglaf.org

From walter.van.holst at xs4all.nl Thu Sep 2 21:48:24 2004
From: walter.van.holst at xs4all.nl (Walter H. van Holst)
Date: Thu Sep 2 21:48:49 2004
Subject: [gutvol-d] Re: unauthorized PG venders (fwd)
In-Reply-To:
References:
Message-ID: <1094186903.14077.8.camel@God>

On Thu, 2004-09-02 at 17:39, Michael Hart wrote:

> Trademark law basically forbids "trading on the good name"
> of the trademark holder. Thus it is illegal to resell PG
> eBooks without permission if you use the Project Gutenberg
> name to do so. Project Gutenberg is a registered trademark.

It appears that people are confusing using the registered trademark with
acknowledging the source of the documents.
They are not necessarily the same and perhaps Project Gutenberg could
clarify what they consider trademark infringement and non-infringing
acknowledgements.

Regards,

Walter
--
Reasonable people adapt themselves to the world. Unreasonable people
try to adapt the world to themselves. All progress, therefore, depends
on unreasonable people. (George Bernard Shaw)

From hart at pglaf.org Sun Sep 5 06:53:59 2004
From: hart at pglaf.org (Michael Hart)
Date: Sun Sep 5 06:54:01 2004
Subject: [gutvol-d] Re: unauthorized PG venders (fwd)
In-Reply-To: <1094186903.14077.8.camel@God>
References: <1094186903.14077.8.camel@God>
Message-ID:

On Fri, 3 Sep 2004, Walter H. van Holst wrote:

> On Thu, 2004-09-02 at 17:39, Michael Hart wrote:
>
>> Trademark law basically forbids "trading on the good name"
>> of the trademark holder. Thus it is illegal to resell PG
>> eBooks without permission if you use the Project Gutenberg
>> name to do so. Project Gutenberg is a registered trademark.
>
> It appears that people are confusing using the registered trademark with
> acknowledging the source of the documents. They are not necessarily the
> same and perhaps Project Gutenberg could clarify what they consider
> trademark infringement and non-infringing acknowledgements.

You can't resell IBM computers as new without trademark infringement,
even if they are genuine IBM computers: this is called "gray marketing."

IBM has the right to have only licensed vendors sell their products;
the same goes for every other trademark.

If you haven't requested and received permission, you are not licensed,
and thus would be infringing on the trademark.

This is not true of used goods, as far as I know.
mh

From cannona at fireantproductions.com Sun Sep 5 13:45:15 2004
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Sep 5 13:46:55 2004
Subject: [gutvol-d] an interesting but complicated source for audio books
Message-ID: <6.1.2.0.0.20040905153417.02423ac0@mail.fireantproductions.com>

I had an idea today on how we might be able to substantially increase our
collection of human-read audio books. The National Library Service for the
Blind and Physically Handicapped (a division of the LOC) has recorded
thousands of titles. They are (obviously) a governmental organization. As
such, the audio recordings would not be copyrighted. The only part of the
production which may be copyrighted would be the underlying books which
were read.

So, let's say that, for example, they have the book Dracula. However, the
book which they read from was copyright 1981. Would it still be possible
to add the audio book to the collection if the text in our public domain
version was so similar to the audio recording as to be indistinguishable?

Like I said before, I realize that this is a complicated question, but I
feel it's worth investigating, as it could add great value to our
collection.

Sincerely
Aaron Cannon

--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail
address.)

From gbnewby at pglaf.org Sun Sep 5 14:34:45 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Sep 5 14:34:47 2004
Subject: [gutvol-d] gutvol-d list moved
In-Reply-To:
References: <20040905165550.GA540487@mind>
Message-ID: <20040905213445.GB1234@pglaf.org>

On Sun, Sep 05, 2004 at 11:50:22AM -0600, Tom Hall wrote:
> Never mind - I figured it out:
>
> I sent the "password" command to gutvol-d-request@lists.pglaf.org from
> my subscribed address.
>
> Since I don't find your announcement in any of my past messages, I'm
> copying the list.
> > - Tom If you visit http://lists.pglaf.org, there's an option at the bottom for list settings and to unsubscribe. It's not that obvious, since it's a busy page. Enter your email address, then on the next page there's an option to have your password emailed to you (yes, everyone got a *random* password...sorry this is a hassle, but I could not recover passwords from Lyris, and prefer a random password to no password [without a password, someone could easily post a message masquerading as you]). I think I set the list to email a password reminder on the 1st day of every month. I'm not sure how welcome this will be for our lists (some do it, some don't). Advice would be welcome! -- Greg > On Sun, Sep 05, 2004 at 10:55:51AM -0600, Tom Hall wrote: > > How do we discover or set our passwords to allow access to > > http://lists.pglaf.org ? > > > > On Sun, Aug 08, 2004 at 01:02:28PM -0700, Greg Newby wrote: > > > As long announced, I've moved gutvol-d from listserv.unc.edu > > > to lists.pglaf.org > > > > > > Visit http://lists.pglaf.org if you want to confirm > > > your subscription settings, get on other lists, > > > etc. > > > > > > I'm sending similar announcements to our other lists, so > > > I apologize in advance that some of you will see multiple > > > messages. 
> > > -- Greg > > > > > > --- > > > For subscription help visit http://listserv.unc.edu > > --- > For subscription help visit http://listserv.unc.edu From gbnewby at pglaf.org Sun Sep 5 18:40:11 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Sep 5 18:40:13 2004 Subject: [gutvol-d] an interesting but complicated source for audio books In-Reply-To: <6.1.2.0.0.20040905153417.02423ac0@mail.fireantproductions.com> References: <6.1.2.0.0.20040905153417.02423ac0@mail.fireantproductions.com> Message-ID: <20040906014011.GB5407@pglaf.org> On Sun, Sep 05, 2004 at 03:45:15PM -0500, Aaron Cannon wrote: > I had an idea today on how we might be able to substantially increase our > collection of human read audio books. The National Library Service for the > Blind and Physically Handicapped (a division of the LOC) has recorded > thousands of titles. They are (obviously) a governmental organization. As > such, the audio recordings would not be copyrighted. The only part of the > production which may be copyrighted would be the underlying books which > were read. > > So, let's say that, for example, they have the book Dracula. However, the > book which they read from was copyright 1981. Would it still be possible > to add the audio book to the collection if the text in our public domain > version was so similar to the audio recording as to be indistinguishable? > > Like I said before, I realize that this is a complicated question, but I > feel it's worth investigating, as it could add great value to our > collection. Hi, Aaron. This is a good idea! I would want to do a little more research to make sure there is no copyright on the audio performances, but it sounds like they're probably public domain in the US. As long as a public domain performance matches a public domain text, we could clear it. 
  -- Greg

From tb at baechler.net Tue Sep 7 22:54:27 2004
From: tb at baechler.net (Tony Baechler)
Date: Tue Sep 7 22:54:38 2004
Subject: [gutvol-d] xml to Braille translator
Message-ID: <5.2.0.9.0.20040907225113.0279d3a0@snoopy2.trkhosting.com>

Hello. My understanding is that eventually PG will switch to some form
of xml for all the ebooks, so the below package might be useful for
making Braille embosser ready files for the blind. The only problem
might be with creating the semantic files it requires, but if PG is
using a standard xml format anyway, this would only have to be done a
few times.

Computers to Help People, Inc. is proud to announce release 0.2 of the
xml2brl program, which translates files in xml or plain text into brf
files suitable for direct printing on a braille embosser. It now handles
some math. Below is the README file for the program, which gives details
of usage. To download the program, go to www.chpi.org/whatsnew.html

Note that I am not including the somewhat long readme file here.

From marcello at perathoner.de Fri Sep 10 09:41:00 2004
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Sep 10 09:41:10 2004
Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
Message-ID: <4141D91C.2060808@perathoner.de>

Anybody fluent enough in French to answer this?

I think he wants to set links to our pages. No problem. He should use
the canonical link form: www.gutenberg.net/etext/12345

-------- Original Message --------
Date: Fri, 10 Sep 2004 09:46:59 +0200
To: webmaster@gutenberg.net
From: zemirline_mohamed
Subject: Demande de liens pour pages d'auteurs

Monsieur,

J'ai l'Honneur de vous demander de bien vouloir m'autoriser d'avoir
des liens directs à quelques livres électroniques de votre site.
En effet, je prépare un petit site pour les livres gratuits et votre
aide me serait d'une très grande utilité.
Pour l'instant j'ai un petit site pour les photos : "Tibhirine ou La
Nature": http://membres.lycos.fr/tibhirine
Veuillez croire Monsieur, en mon entière reconnaissance.

zemirline_mohamed@yahoo.fr
M.Zemirline

--
Marcello Perathoner
webmaster@gutenberg.net

From joshua at hutchinson.net Fri Sep 10 10:16:19 2004
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Fri Sep 10 10:16:33 2004
Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
Message-ID: <20040910171619.950AC9E8C8@ws6-2.us4.outblaze.com>

Sorry, I tried to run it through BabelFish to see if I was remembering
my old French classes correctly. I came away more confused by
BabelFish's English translation than I was trying to puzzle out the
French!

Josh

----- Original Message -----
From: Marcello Perathoner
Date: Fri, 10 Sep 2004 18:41:00 +0200
To: Project Gutenberg volunteer discussion
Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]

> Anybody fluent enough in French to answer this?
>
> I think he wants to set links to our pages. No problem. He should use
> the canonical link form: www.gutenberg.net/etext/12345
>
>
>
>
> -------- Original Message --------
> Date: Fri, 10 Sep 2004 09:46:59 +0200
> To: webmaster@gutenberg.net
> From: zemirline_mohamed
> Subject: Demande de liens pour pages d'auteurs
>
>
> Monsieur,
>
> J'ai l'Honneur de vous demander de bien vouloir m'autoriser d'avoir
> des liens directs
> à quelques livres électroniques de votre site.
> En effet, je prépare un petit site pour les livres gratuits et votre aide
> me serait d'une très
> grande utilité.
> Pour l'instant j'ai un petit site pour les photos : "Tibhirine ou La
> Nature":
> http://membres.lycos.fr/tibhirine
> Veuillez croire Monsieur, en mon entière reconnaissance.
> > zemirline_mohamed@yahoo.fr > M.Zemirline > > > > > -- > Marcello Perathoner > webmaster@gutenberg.net > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From hacker at gnu-designs.com Fri Sep 10 10:23:50 2004 From: hacker at gnu-designs.com (David A. Desrosiers) Date: Fri Sep 10 10:24:32 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <20040910171619.950AC9E8C8@ws6-2.us4.outblaze.com> References: <20040910171619.950AC9E8C8@ws6-2.us4.outblaze.com> Message-ID: > Sorry, I tried to run it through BabelFish to see if I was > remembering my old French classes correctly. I came away more > confused by BabelFish's english translation than I was trying to > puzzle out the French! Try InterTran[1] or WorldLingo[2], they are MUCH MUCH better than Babelfish at translating actual human speech, instead of word-by-word lookups. [1] http://www.tranexp.com:2000/Translate/result.shtml [2] http://www.worldlingo.com/wl/Translate d. From traverso at dm.unipi.it Fri Sep 10 10:47:23 2004 From: traverso at dm.unipi.it (Carlo Traverso) Date: Fri Sep 10 10:47:34 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <4141D91C.2060808@perathoner.de> (message from Marcello Perathoner on Fri, 10 Sep 2004 18:41:00 +0200) References: <4141D91C.2060808@perathoner.de> Message-ID: <200409101747.i8AHlNps021777@posso.dm.unipi.it> >>>>> "Marcello" == Marcello Perathoner writes: Marcello> Anybody fluent enough in French to answer this? Marcello> I think he wants to set links to our pages. No Marcello> problem. He should use the canonical link form: Marcello> www.gutenberg.net/etext/12345 Yes, he asks the permission to have direct links to PG books. The PG policy should be stated in some obvious place, accessible from the front page. We have a robots link and not a link for "terms and conditions". 
We should state there how to link to PG books; for example, is linking to www.gutenberg.net/etext/12345/12345-h/12345.html permitted/forbidden/discouraged? So, should we answer "yes", or "yes, but", or "no, but", and on which basis? The exact question he poses is "I am honored to ask you to be willing to authorize me to have direct links to some ebooks in your site". The rest is a presentation of his site. Carlo Marcello> -------- Original Message -------- Date: Fri, 10 Sep Marcello> 2004 09:46:59 +0200 To: webmaster@gutenberg.net From: Marcello> zemirline_mohamed Subject: Marcello> Demande de liens pour pages d'auteurs Marcello> Monsieur, Marcello> J'ai l'Honneur de vous demander de bien vouloir Marcello> m'autoriser d'avoir des liens directs à quelques livres Marcello> électroniques de votre site. En effet, je prépare un Marcello> petit site pour les livres gratuits et votre aide me Marcello> serait d'une très grande utilité. Pour l'instant j'ai Marcello> un petit site pour les photos : "Tibhirine ou La Marcello> Nature": http://membres.lycos.fr/tibhirine Veuillez Marcello> croire Monsieur, en mon entière reconnaissance. Marcello> zemirline_mohamed@yahoo.fr M.Zemirline Marcello> -- Marcello Perathoner webmaster@gutenberg.net Marcello> _______________________________________________ gutvol-d Marcello> mailing list gutvol-d@lists.pglaf.org Marcello> http://lists.pglaf.org/listinfo.cgi/gutvol-d From gbnewby at pglaf.org Fri Sep 10 15:31:54 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Sep 10 15:31:55 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <200409101747.i8AHlNps021777@posso.dm.unipi.it> References: <4141D91C.2060808@perathoner.de> <200409101747.i8AHlNps021777@posso.dm.unipi.it> Message-ID: <20040910223154.GA30327@pglaf.org> On Fri, Sep 10, 2004 at 07:47:23PM +0200, Carlo Traverso wrote: > >>>>> "Marcello" == Marcello Perathoner writes: > > Marcello> Anybody fluent enough in French to answer this? 
> > Marcello> I think he wants to set links to our pages. No > Marcello> problem. He should use the canonical link form: > Marcello> www.gutenberg.net/etext/12345 > > > Yes, he asks the permission to have direct links to PG books. The PG > policy should be stated in some obvious place, accessible from the > front page. We have a robots link and not a link for "terms and > conditions". We should state there how to link to PG books; for > example, is linking to www.gutenberg.net/etext/12345/12345-h/12345.html > permitted/forbidden/discouraged? So, should we answer "yes", or "yes, > but", or "no, but", and on which basis? The exact question he poses is > > "I am honored to ask you to be willing to authorize me to have direct > links to some ebooks in your site". > > The rest is a presentation of his site. Here's a little blurb I send out, but we don't have it on the Web pages anywhere: Thanks for taking the time to request permission for a link to Project Gutenberg, as below. However, no such permission is required, and we do believe it is a poor precedent for Project Gutenberg to grant it. As you probably know, there have been a few legal attempts to block linking, especially "deep" linking. At least one legal case, in the UK, was won and one organization was subsequently blocked from deep linking to another. However, in the US there is neither a body of case law nor any state or federal regulations that we are aware of that would require permission to set up a hyperlink. Project Gutenberg does not encourage deep linking to our Web site and, in some cases, has actively discouraged it. But a link to the main page, http://www.gutenberg.net, would be most welcome, and will help to distribute our free electronic texts. Project Gutenberg has no interest or desire to grant permission to any site linking to us, as it would be a serious burden to our volunteer staff and set a dangerous precedent from a legal viewpoint. 
Thus, we would like to encourage you to link to Project Gutenberg. In
addition, we would be thrilled for you to download all 15000+ eBooks for
your internal use (within the licensing restrictions you will find at
the top of each eBook, see http://gutenberg.net/license).

Also, note that it is our policy to not provide hyperlinks, ads
or other materials on our Web site or in our FTP collection, except
as it pertains to the collection itself. In other words, we will
not reciprocate with a link to your site, or mention of your site.

Finally, since you are interested in free etexts, we would like
to invite you to visit our Volunteer's area at http://gutenberg.net
to see how you or your organization could help in bringing great
literature to the world without cost.

Again, thanks for your interest in Project Gutenberg.

> Marcello> -------- Original Message -------- Date: Fri, 10 Sep
> Marcello> 2004 09:46:59 +0200 To: webmaster@gutenberg.net From:
> Marcello> zemirline_mohamed Subject:
> Marcello> Demande de liens pour pages d'auteurs
>
>
> Marcello> Monsieur,
>
> Marcello> J'ai l'Honneur de vous demander de bien vouloir
> Marcello> m'autoriser d'avoir des liens directs à quelques livres
> Marcello> électroniques de votre site. En effet, je prépare un
> Marcello> petit site pour les livres gratuits et votre aide me
> Marcello> serait d'une très grande utilité. Pour l'instant j'ai
> Marcello> un petit site pour les photos : "Tibhirine ou La
> Marcello> Nature": http://membres.lycos.fr/tibhirine Veuillez
> Marcello> croire Monsieur, en mon entière reconnaissance.
> > Marcello> zemirline_mohamed@yahoo.fr M.Zemirline > > > > > Marcello> -- Marcello Perathoner webmaster@gutenberg.net > > Marcello> _______________________________________________ gutvol-d > Marcello> mailing list gutvol-d@lists.pglaf.org > Marcello> http://lists.pglaf.org/listinfo.cgi/gutvol-d > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From marcello at perathoner.de Sat Sep 11 08:07:26 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat Sep 11 08:07:58 2004 Subject: [gutvol-d] gutvol-d list does not set Reply-To header Message-ID: <414314AE.6060004@perathoner.de> Replies to posts will only get to the originator and not to the list. -- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Sat Sep 11 08:08:04 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat Sep 11 08:08:36 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <20040910223154.GA30327@pglaf.org> References: <4141D91C.2060808@perathoner.de> <200409101747.i8AHlNps021777@posso.dm.unipi.it> <20040910223154.GA30327@pglaf.org> Message-ID: <414314D4.7090100@perathoner.de> Greg Newby wrote: > Project Gutenberg does not encourage deep linking to our Web site and, > in some cases, has actively discouraged it. But a link to the main > page, http://www.gutenberg.net, would be most welcome, and will help > to distribute our free electronic texts. This, I think, does not reflect "common usage". There are millions of deep links into the PG site, directly to the files or into the old (Pietro's) search page. Many of them are found inside newsgroups, blogs and reviews where changing them is impossible. I am trying to keep those old links working with redirects. We should (and I have done so for some time) encourage links to the bibrec page in the canonical www.gutenberg.net/etext/12345 form instead of to the files. 
We also have a canonical www.gutenberg.net/author/Mark_Twain url that
gives you all books by that author. This url is used on the Wikipedia
pages to get a current list of books vs. the often stale and incomplete
edited-by-hand lists. (I myself put in many of them.)

Any user that gets on one of those pages can easily navigate to the root
page, so it is not necessary to require deep linkers to also set a link
to the root.

Deep linking to the files, while harmless, is less effective than
linking to the bibrec page. The user may not be aware that there are
more formats to choose from or that there are newer versions, and may
not see the huge amount of other material we have.

Bottom line:

- we should allow deep links to /etext/12345 and /author/Mark_Twain

- we should discourage deep links to the files

And further on, we should get our subject cataloging up to date so we
can offer a url like /subject/Mystery.

--
Marcello Perathoner
webmaster@gutenberg.net

From gbnewby at pglaf.org Sat Sep 11 12:55:57 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Sat Sep 11 12:55:59 2004
Subject: FW: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
In-Reply-To:
References:
Message-ID: <20040911195557.GC19751@pglaf.org>

(Forwarding to the list).

On Fri, Sep 10, 2004 at 08:13:34PM -0500, John Hagerson wrote:
> Many sites demand notification for any link (even to the home page). Because
> PG specifically does NOT want to be asked, maybe it would be helpful to put
> a short notice along the lines of "you are welcome to link to our home page;
> please don't tax our small, volunteer staff by telling us that you plan to
> do so; while we can't stop you, we would consider it impolite if you linked
> to any other page on our site other than the home page."
>
> Just a thought. I'm a software engineer, not a lawyer.
> > -----Original Message----- > From: gutvol-d-bounces@lists.pglaf.org > [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of Greg Newby > Sent: Friday, September 10, 2004 5:32 PM > To: gutvol-d@lists.pglaf.org > Subject: Re: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] > > On Fri, Sep 10, 2004 at 07:47:23PM +0200, Carlo Traverso wrote: > > >>>>> "Marcello" == Marcello Perathoner writes: > > > > Marcello> Anybody fluent enough in French to answer this? > > > > Marcello> I think he wants to set links to our pages. No > > Marcello> problem. He should use the canonical link form: > > Marcello> www.gutenberg.net/etext/12345 > > > > > > Yes, he asks the permission to have direct links to PG books. The PG > > policy should be stated in some obvious place, accessible from the > > front page. We have a robots link and not a link for "terms and > > conditions". We should state there how to link to PG books; for > > example, is linking to www.gutenberg.net/etext/12345/12345-h/12345.html > > permitted/forbidden/discouraged? So, should we answer "yes", or "yes, > > but", or "no, but", and on which basis? The exact question he poses is > > > > "I am honored to ask you to be willing to authorize me to have direct > > links to some ebooks in your site". > > > > The rest is a presentation of his site. > > Here's a little blurb I send out, but we don't have it on the Web pages > anywhere: > > > Thanks for taking the time to request permission for a link to Project > Gutenberg, as below. However, no such permission is required, and we > do believe it is a poor precedent for Project Gutenberg to grant it. > > As you probably know, there have been a few legal attempts to block > linking, especially "deep" linking. At least one legal case, in the > UK, was won and one organization was subsequently blocked from deep > linking to another. 
> > However, in the US there is neither a body of case law nor any state > or federal regulations that we are aware of that would require > permission to set up a hyperlink. > > Project Gutenberg does not encourage deep linking to our Web site and, > in some cases, has actively discouraged it. But a link to the main > page, http://www.gutenberg.net, would be most welcome, and will help > to distribute our free electronic texts. > > Project Gutenberg has no interest or desire to grant permission to any > site linking to us, as it would be a serious burden to our volunteer > staff and set a dangerous precedent from a legal viewpoint. > > Thus, we would like to encourage you to link to Project Gutenberg. In > addition, would be thrilled for you to download all 15000+ eBooks for > your internal use (within the licensing restrictions you will find at > the top of each eBook, see http://gutenberg.net/license). > > Also, note that it is our policy to not provide hyperlinks, ads > or other materials on our Web site or in our FTP collection, except > as it pertains to the collection itself. In other words, we will > not reciprocate with a link to your site, or mention of your site. > > Finally, since you are interested in free etexts, we would like > to invite you to visit our Volunteer's area at http://gutenberg.net > to see how you or your organization could help in bringing great > literature to the world without cost. > > Again, thanks for your interest in Project Gutenberg. > > > > > Marcello> -------- Original Message -------- Date: Fri, 10 Sep > > Marcello> 2004 09:46:59 +0200 To: webmaster@gutenberg.net From: > > Marcello> zemirline_mohamed Subject: > > Marcello> Demande de liens pour pages d'auteurs > > > > > > Marcello> Monsieur, > > > > Marcello> J'ai l'Honneur de vous demander de bien vouloir > > Marcello> m'autoriser d'avoir des liens directs ? quelques livres > > Marcello> ?lectroniques de votre site. 
En effet, je prépare un
> > Marcello> petit site pour les livres gratuits et votre aide me
> > Marcello> serait d'une très grande utilité. Pour l'instant j'ai
> > Marcello> un petit site pour les photos : "Tibhirine ou La
> > Marcello> Nature": http://membres.lycos.fr/tibhirine Veuillez
> > Marcello> croire Monsieur, en mon entière reconnaissance.
> >
> > Marcello> zemirline_mohamed@yahoo.fr M.Zemirline
> >
> >
> >
> >
> > Marcello> -- Marcello Perathoner webmaster@gutenberg.net
> >
> > Marcello> _______________________________________________ gutvol-d
> > Marcello> mailing list gutvol-d@lists.pglaf.org
> > Marcello> http://lists.pglaf.org/listinfo.cgi/gutvol-d
> >
> > _______________________________________________
> > gutvol-d mailing list
> > gutvol-d@lists.pglaf.org
> > http://lists.pglaf.org/listinfo.cgi/gutvol-d
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From ke at gnu.franken.de Sat Sep 11 13:53:12 2004
From: ke at gnu.franken.de (Karl Eichwalder)
Date: Sat Sep 11 13:47:49 2004
Subject: FW: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
In-Reply-To: <20040911195557.GC19751@pglaf.org> (Greg Newby's message of "Sat, 11 Sep 2004 12:55:57 -0700")
References: <20040911195557.GC19751@pglaf.org>
Message-ID:

Greg Newby writes:

> (Forwarding to the list).

You replied :)

> On Fri, Sep 10, 2004 at 08:13:34PM -0500, John Hagerson wrote:
>> Many sites demand notification for any link (even to the home page).

If they don't want traffic (to be linked to), they must not publish web
pages.

Deep linking is another issue. It is arguable whether you are allowed
to link to a special subframe or display foreign content as part of your
frameset.
--
                                 |      ,__o
                                 |    _-\_<,
http://www.gnu.franken.de/ke/    |   (*)/'(*)

From traverso at dm.unipi.it Sat Sep 11 14:17:25 2004
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Sat Sep 11 14:17:43 2004
Subject: FW: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
In-Reply-To: (message from Karl Eichwalder on Sat, 11 Sep 2004 22:53:12 +0200)
References: <20040911195557.GC19751@pglaf.org>
Message-ID: <200409112117.i8BLHP6Z009559@posso.dm.unipi.it>

I see from PG pages (FAQ #0):

  The mission of Project Gutenberg is simple: "To encourage the creation
  and distribution of eBooks."

  This mission is, as much as possible, to encourage *ALL* those who are
  interested in making eBooks and helping to give them away.

A link to a book is a help to give it away.

Discouraging direct linking to the download pages of an individual book
is IMHO contrary to PG's mission, since it makes it harder to find and
download the book. Why should we force people who want a precise book
to query our database (use our resources) and possibly miss the book,
when from the book downloading page they have links to the rest of the
site?

Carlo

From pm002c3918 at blueyonder.co.uk Fri Sep 10 10:56:32 2004
From: pm002c3918 at blueyonder.co.uk (Miranda van de Heijning)
Date: Sat Sep 11 14:55:08 2004
Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs]
References: <20040910171619.950AC9E8C8@ws6-2.us4.outblaze.com>
Message-ID: <000e01c4975f$821b96f0$0302a8c0@PAULANDMIRANDA>

Hi all,

It says:

"I have the honor of asking you whether you could allow me to put direct
links to some of the e-texts on your site. Basically, I am preparing a
little site for free books and your help would be of great use. For
now, I have a little site with pictures, "Tibhirine ou La Nature":
http://membres.lycos.fr/tibhirine .
Please count on my full recognition (?--I think this either means he will acknowledge the source or that he will be grateful for the links)" Kind regards, Miranda van de Heijning J'ai l'Honneur de vous demander de bien vouloir m'autoriser d'avoir > des liens directs > à quelques livres électroniques de votre site. > En effet, je prépare un petit site pour les livres gratuits et votre aide > me serait d'une très > grande utilité. > Pour l'instant j'ai un petit site pour les photos : "Tibhirine ou La > Nature": > http://membres.lycos.fr/tibhirine > Veuillez croire Monsieur, en mon entière reconnaissance. > > zemirline_mohamed@yahoo.fr > M.Zemirline ----- Original Message ----- From: "Joshua Hutchinson" To: "Project Gutenberg volunteer discussion" Sent: Friday, September 10, 2004 6:16 PM Subject: Re: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] Sorry, I tried to run it through BabelFish to see if I was remembering my old French classes correctly. I came away more confused by BabelFish's English translation than I was when trying to puzzle out the French! Josh ----- Original Message ----- From: Marcello Perathoner Date: Fri, 10 Sep 2004 18:41:00 +0200 To: Project Gutenberg volunteer discussion Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] > Anybody fluent enough in French to answer this? > > I think he wants to set links to our pages. No problem. He should use > the canonical link form: www.gutenberg.net/etext/12345 > > > > > -------- Original Message -------- > Date: Fri, 10 Sep 2004 09:46:59 +0200 > To: webmaster@gutenberg.net > From: zemirline_mohamed > Subject: Demande de liens pour pages d'auteurs > > > Monsieur, > > J'ai l'Honneur de vous demander de bien vouloir m'autoriser d'avoir > des liens directs > à quelques livres électroniques de votre site. > En effet, je prépare un petit site pour les livres gratuits et votre aide > me serait d'une très > grande utilité.
> Pour l'instant j'ai un petit site pour les photos : "Tibhirine ou La > Nature": > http://membres.lycos.fr/tibhirine > Veuillez croire Monsieur, en mon entière reconnaissance. > > zemirline_mohamed@yahoo.fr > M.Zemirline > > > > > -- > Marcello Perathoner > webmaster@gutenberg.net > _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d From gbnewby at pglaf.org Sat Sep 11 15:08:59 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Sat Sep 11 15:09:00 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <414314D4.7090100@perathoner.de> References: <4141D91C.2060808@perathoner.de> <200409101747.i8AHlNps021777@posso.dm.unipi.it> <20040910223154.GA30327@pglaf.org> <414314D4.7090100@perathoner.de> Message-ID: <20040911220859.GC22304@pglaf.org> On Sat, Sep 11, 2004 at 05:08:04PM +0200, Marcello Perathoner wrote: > Greg Newby wrote: > > >Project Gutenberg does not encourage deep linking to our Web site and, > >in some cases, has actively discouraged it. But a link to the main > >page, http://www.gutenberg.net, would be most welcome, and will help > >to distribute our free electronic texts. > > This, I think, does not reflect "common usage". There are millions of > deep links into the PG site, directly to the files or into the old > (Pietro's) search page. I disagree. Common usage at content sites, such as online newspapers, is for links directly to content to work for a while, then to stop working as part of the document management process. But this is not an important disagreement, because I think we will agree on the rest: > Many of them are found inside newsgroups, blogs and reviews where > changing them is impossible. I am trying to keep those old links working > with redirects.
> > We should (and I have done so for some time) encourage links to the > bibrec page in the canonical www.gutenberg.net/etext/12345 form instead > of to the files. Agreed. But the bibrec page has only existed for a few months. Writing a linking policy that says to link to the canonical etext/xxx location would help. > We also have a canonical www.gutenberg.net/author/Mark_Twain URL that > gives you all books by that author. This URL is used on the Wikipedia > pages to get a current list of books vs. the often stale and incomplete > edited-by-hand lists. (I myself put in many of them.) I didn't know about this one. Another reason to write a linking policy (you are guessing who I will nominate to write it, yes?). > Any user that gets on one of those pages can easily navigate to the root > page, so it is not necessary to require deep linkers to also set a link > to the root. Yes. > Deep linking to the files, while harmless, is less effective than > linking to the bibrec page. The user may not be aware that there are > more formats to choose from, he may not be aware that there are newer > versions and she may not see the huge amount of other material we have. Deep linking to the bibrec is harmless. Deep linking to a particular eBook file (especially in the etext?? dirs) is perilous. > Bottom line: > > - we should allow deep links to /etext/12345 and /author/Mark_Twain Yes. > - we should discourage deep links to the files Yes. That is what the earlier policy I sent was talking about; we didn't have the bibrec pages when I wrote it. > And further on, we should get our subject cataloging up to date so we > can offer a URL like /subject/Mystery. Absolutely. If this all seems agreeable to people, we just need to write a "linking HOWTO" or somesuch. As others have said, we *DO* want people to link to us.
But we *DON'T* believe people need to ask permission to do so (and I have on several occasions refused to fill out stupid forms some organizations send, asking permission to link to us). We *DO* want people to link to the bibrec pages, author pages, etc. But we *DON'T* want people linking directly to eBook files, because their locations can change. (This is less an issue with the post-10K directory structure, which is far more regular....but since we still have tens of thousands of files in the old directory structure, we should just discourage linking to files directly.) BTW, once we get some sort of conversion on the fly going, I expect URLs like this: http://gutenberg.net/etext/1234/1234.txt http://gutenberg.net/etext/1234/1234.pdf http://gutenberg.net/etext/1234/1234.mp3?maxsize=10m&speed=2x http://gutenberg.net/etext/1234/1234.htm?css=blueplaid&font=verdana and the like...where files we already have are delivered, but files we don't have are generated. I don't think I'd recommend people link directly to such URLs because (a) they depend on converters that might not always be available, or might require additional syntax, and (b) because such a URL is fine for individual use, but if you tell your friend such a URL then their choices for alternate formats are less likely to be evident to them. Further thoughts? -- Greg From joshua at hutchinson.net Sat Sep 11 19:45:33 2004 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Sat Sep 11 19:45:43 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <20040911220859.GC22304@pglaf.org> References: <4141D91C.2060808@perathoner.de> <200409101747.i8AHlNps021777@posso.dm.unipi.it> <20040910223154.GA30327@pglaf.org> <414314D4.7090100@perathoner.de> <20040911220859.GC22304@pglaf.org> Message-ID: <4143B84D.6060707@hutchinson.net> Greg Newby wrote: >If this all seems agreeable to people, we just need to write >a "linking HOWTO" or somesuch. > All the examples you gave sound perfect to me. 
Basically, someone (not me) write it up and post it off the main page and let's move on! :-) Next problem! hehe Josh From marcello at perathoner.de Sun Sep 12 05:56:04 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Sep 12 05:56:34 2004 Subject: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <20040911220859.GC22304@pglaf.org> References: <4141D91C.2060808@perathoner.de> <200409101747.i8AHlNps021777@posso.dm.unipi.it> <20040910223154.GA30327@pglaf.org> <414314D4.7090100@perathoner.de> <20040911220859.GC22304@pglaf.org> Message-ID: <41444764.4020407@perathoner.de> Greg Newby wrote: >>>Project Gutenberg does not encourage deep linking to our Web site and, >>>in some cases, has actively discouraged it. But a link to the main >>>page, http://www.gutenberg.net, would be most welcome, and will help >>>to distribute our free electronic texts. >> >>This, I think, does not reflect "common usage". There are millions of >>deep links into the PG site, directly to the files or into the old >>(Pietro's) search page. > > I disagree. Common usage at content sites, such as online > newspapers, is for links directly to content to work for awhile, > then to stop working as part of the document management process. I meant common usage for PG and sites linking to PG. > I didn't know about this one. Another reason to write a linking > policy (you are guessing who I will nominate to write it, yes?). I will write one ... later. -- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Sun Sep 12 08:40:35 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Sep 12 08:41:11 2004 Subject: [gutvol-d] Linking to PG policy Message-ID: <41446DF3.1010104@perathoner.de> Wrote a page about linking policy. http://www.gutenberg.net/howto-link That leaves us with the question of how to deal with the many "Project Gutenberg Multiplexer" sites that offer independent search facilities for PG books. 
Most of those do deep linking to the files. Requiring them to link to the bibrec page would be pretty much counterintuitive because their reason of existence is to provide a search interface of their own. Should we require them to prominently display a live link to our main page along with any search results? -- Marcello Perathoner webmaster@gutenberg.net From hart at pglaf.org Sun Sep 12 09:04:22 2004 From: hart at pglaf.org (Michael Hart) Date: Sun Sep 12 09:04:23 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <41446DF3.1010104@perathoner.de> References: <41446DF3.1010104@perathoner.de> Message-ID: On Sun, 12 Sep 2004, Marcello Perathoner wrote: > Wrote a page about linking policy. > > http://www.gutenberg.net/howto-link > > > That leaves us with the question of how to deal with the many "Project > Gutenberg Multiplexer" sites that offer independent search facilities for PG > books. Most of those do deep linking to the files. > > Requiring them to link to the bibrec page would be pretty much > counterintuitive because their reason of existence is to provide a search > interface of their own. > > Should we require them to prominently display a live link to our main page > along with any search results? I think that would probably be acceptable to most. Leave as many options open as possible. Michael From joshua at hutchinson.net Sun Sep 12 09:23:16 2004 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Sun Sep 12 09:23:15 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <41446DF3.1010104@perathoner.de> References: <41446DF3.1010104@perathoner.de> Message-ID: <414477F4.6080200@hutchinson.net> I think we should just encourage linkers to link to the bibrec pages and basically ignore the links directly to the texts. If the texts change location (like the pre 10k will), then those links will break and oh, well, sorry. We never said those were good link spots. My humble opinion, of course. 
Josh Marcello Perathoner wrote: > Should we require them to prominently display a live link to our main > page along with any search results? > From sly at victoria.tc.ca Sat Sep 11 23:47:11 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Sun Sep 12 12:48:27 2004 Subject: FW: [gutvol-d] [Fwd: Demande de liens pour pages d'auteurs] In-Reply-To: <200409112117.i8BLHP6Z009559@posso.dm.unipi.it> References: <20040911195557.GC19751@pglaf.org> <200409112117.i8BLHP6Z009559@posso.dm.unipi.it> Message-ID: Yes. As others have said, linking to the bibrec file is the best solution for many situations out there. A few months ago, I spent many hours searching and cross-referencing to add links to PG titles in the English-language Wikipedia. This seems to me like a perfect fit, as the ideal addition to an encyclopedia-type article is a link to where you can find more in-depth information on the topic. With the broadening of topics in PG (thanks largely to DP), appropriate, relevant links can be added to Wikipedia not just for the obvious articles about famous books, but also biographies and autobiographies, and books about famous events (such as the sinking of the Titanic). I would argue that including these links not only raises the public profile of PG, but also shows people that PG does have books about the very topics they are interested in, not just a few old, musty classics. On Sat, 11 Sep 2004, Carlo Traverso wrote: > > I see from PG pages (FAQ #0): > > The mission of Project Gutenberg is simple: > > "To encourage the creation and distribution of eBooks." > > This mission is, as much as possible, to encourage *ALL* those who > are interested in making eBooks and helping to give them away. > > A link to a book is a help to give it away.
> From gbnewby at pglaf.org Sun Sep 12 12:53:49 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Sep 12 12:53:52 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: References: <41446DF3.1010104@perathoner.de> Message-ID: <20040912195349.GA22689@pglaf.org> On Sun, Sep 12, 2004 at 09:04:22AM -0700, Michael Hart wrote: > > > On Sun, 12 Sep 2004, Marcello Perathoner wrote: > > >Wrote a page about linking policy. > > > > http://www.gutenberg.net/howto-link Outstanding! Thanks very much!! See below for an addition: > >That leaves us with the question of how to deal with the many "Project > >Gutenberg Multiplexer" sites that offer independent search facilities for > >PG books. Most of those do deep linking to the files. > > > >Requiring them to link to the bibrec page would be pretty much > >counterintuitive because their reason of existence is to provide a search > >interface of their own. > > > >Should we require them to prominently display a live link to our main page > >along with any search results? I would not say "require," but "request." Or "suggest." > I think that would probably be acceptable to most. > > Leave as many options open as possible. Yes, I agree. But let's add a link to http://www.gutenberg.net/findalternate and a mention that people could use our own database to have correct & updated information about our eBooks. The RDF/XML catalog is a wonderful resource for such sites, but still new enough that most places probably don't know about it. ** In fact, let's run a little blurb in the upcoming newsletter: "Database available: Visit http://www.gutenberg.net/findalternate for the Project Gutenberg catalog database in XML format. This could be useful for people developing their own search interface to Project Gutenberg, or to people wishing to track all the latest files." 
From traverso at dm.unipi.it Sun Sep 12 16:13:59 2004 From: traverso at dm.unipi.it (Carlo Traverso) Date: Sun Sep 12 16:14:30 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <414477F4.6080200@hutchinson.net> (message from Joshua Hutchinson on Sun, 12 Sep 2004 12:23:16 -0400) References: <41446DF3.1010104@perathoner.de> <414477F4.6080200@hutchinson.net> Message-ID: <200409122313.i8CNDxcv003820@posso.dm.unipi.it> I think that the links accessed from the "In Depth Information and Volunteer Section" should be allowed too. Probably, the only discouraged links should be the direct links to books (for example DP has a lot of them in the "gold star" list, but it would be better to change...) Linking e.g. to a section of the FAQ is very useful. By the way, some links at PG are broken: e.g. the newsletters. Carlo From marcello at perathoner.de Mon Sep 13 02:16:09 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Sep 13 02:16:58 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <20040912195349.GA22689@pglaf.org> References: <41446DF3.1010104@perathoner.de> <20040912195349.GA22689@pglaf.org> Message-ID: <41456559.3000803@perathoner.de> Greg Newby wrote: > Yes, I agree. But let's add a link to > http://www.gutenberg.net/findalternate and a mention that people could > use our own database to have correct & updated information about our > eBooks. I've added a section re. independent search sites. 
-- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Mon Sep 13 02:23:56 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Sep 13 02:24:43 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <200409122313.i8CNDxcv003820@posso.dm.unipi.it> References: <41446DF3.1010104@perathoner.de> <414477F4.6080200@hutchinson.net> <200409122313.i8CNDxcv003820@posso.dm.unipi.it> Message-ID: <4145672C.9020201@perathoner.de> Carlo Traverso wrote: > I think that the links accessed from the "In Depth Information and > Volunteer Section" should be allowed too. Probably, the only > discouraged links should be the direct links to books (for example DP > has a lot of them in the "gold star" list, but it would be better to > change...) > > Linking e.g. to a section of the FAQ is very useful. The policy is addressing the many "outside" people. They almost always want to link to some book. I'll see what I can do to soften the policy ... > By the way, some links at PG are broken: e.g. the newsletters. I know. I fixed a slew of them in the past but they keep going broken. You'll have to bug the newsletter editors. :-) -- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Mon Sep 13 03:36:06 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Sep 13 03:37:04 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <200409122313.i8CNDxcv003820@posso.dm.unipi.it> References: <41446DF3.1010104@perathoner.de> <414477F4.6080200@hutchinson.net> <200409122313.i8CNDxcv003820@posso.dm.unipi.it> Message-ID: <41457816.6030001@perathoner.de> Carlo Traverso wrote: > I think that the links accessed from the "In Depth Information and > Volunteer Section" should be allowed too. Probably, the only > discouraged links should be the direct links to books (for example DP > has a lot of them in the "gold star" list, but it would be better to > change...) > > Linking e.g. to a section of the FAQ is very useful. 
I reworded some of the policy to only deprecate direct links to the files. -- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Mon Sep 13 09:31:22 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Sep 13 09:31:56 2004 Subject: [gutvol-d] Top 100 Message-ID: <4145CB5A.1090604@perathoner.de> New experimental top 100 books and authors at: http://www.gutenberg.net/catalog/world/top Did you know our most read authors are Various and Anonymous? Did you know our most downloaded eBooks are: 1. Audio: "The House of Usher" by Edgar Allan Poe 2. Audio: "Bleak House" by Charles Dickens 3. Vanity Fair by William Makepeace Thackeray 4. Ulysses by James Joyce with 1. being downloaded 5 times as often as 2., and 9 times as often as 3. ? Yes, there is a solution to the mystery. Anybody wants to apply his/her reasoning power? The solution is a few lines down. It turns out I had to disqualify those as well as a few others that have mp3 files and use the word "House" in the title. (Moreover, research has shown that "Usher" is a rap artist. :-) -- Marcello Perathoner webmaster@gutenberg.net From hart at pglaf.org Mon Sep 13 09:50:07 2004 From: hart at pglaf.org (Michael Hart) Date: Mon Sep 13 09:50:09 2004 Subject: [gutvol-d] Top 100 In-Reply-To: <4145CB5A.1090604@perathoner.de> References: <4145CB5A.1090604@perathoner.de> Message-ID: I'm not sure I received the entire message below, since there are a number of blank lines and then not what I expected below them. On Mon, 13 Sep 2004, Marcello Perathoner wrote: > New experimental top 100 books and authors at: > > http://www.gutenberg.net/catalog/world/top > > > Did you know our most read authors are Various and Anonymous? > > Did you know our most downloaded eBooks are: > > 1. Audio: "The House of Usher" by Edgar Allan Poe > 2. Audio: "Bleak House" by Charles Dickens > 3. Vanity Fair by William Makepeace Thackeray > 4. Ulysses by James Joyce > > with 1. 
being downloaded 5 times as often as 2., and 9 times as often as 3. ? Was there supposed to be more to this list? If so, I think I missed the original posting of the Top 100, and searching for "Top 100" didn't get to any recent messages. > Yes, there is a solution to the mystery. Anybody wants to apply his/her > reasoning power? > > The solution is a few lines down. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It turns out I had to disqualify those as well as a few others that have mp3 > files and use the word "House" in the title. (Moreover, research has shown > that "Usher" is a rap artist. :-) OK, perhaps I was just expecting something more serious here, but. . . . I would include all files, not sure why to disqualify MP3 files, or "house" remixes. . .hee hee! I think we should measure everything, though I think sub-lists would be acceptable. . .such as the "Whole Top 100," then fiction, non-fiction, .txt files, .htm files, .mp3 files, etc., etc., etc. Michael From marcello at perathoner.de Mon Sep 13 10:00:44 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Sep 13 10:01:17 2004 Subject: [gutvol-d] Top 100 In-Reply-To: References: <4145CB5A.1090604@perathoner.de> Message-ID: <4145D23C.6060100@perathoner.de> Michael Hart wrote: > I would include all files, not sure why to disqualify MP3 files, or > "house" > remixes. . .hee hee! Because those files were downloaded in error by people who wanted to have mp3 files with "House" music. If I didn't disqualify them they would sit in front of the top list forever, being downloaded nearly 10 times more often than the next non-"House" eBook.
-- Marcello Perathoner webmaster@gutenberg.net From hart at pglaf.org Mon Sep 13 10:05:08 2004 From: hart at pglaf.org (Michael Hart) Date: Mon Sep 13 10:05:09 2004 Subject: [gutvol-d] Top 100 In-Reply-To: <4145D23C.6060100@perathoner.de> References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de> Message-ID: On Mon, 13 Sep 2004, Marcello Perathoner wrote: > Michael Hart wrote: > >> I would include all files, not sure why to disqualify MP3 files, or >> "house" >> remixes. . .hee hee! > > Because those files were downloaded in error by people who wanted to have mp3 > files with "House" music. > > If I didn't disqualify them they would sit in front of the top list forever, > being downloaded nearly 10 times more often than the next non-"House" eBook. Wow!!! So these are people most likely using WEBVCR programs to sweep up everything with "house" + "mp3" ??? Who'da thunk it??? Can you send me the Top 100 list[s]? Perhaps we can automate something to send me this, and we can put some notes in the Newsletter. . . . Thanks! Michael From gbnewby at pglaf.org Mon Sep 13 11:01:17 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Sep 13 11:01:18 2004 Subject: Fwd: Re: [gutvol-d] Linking to PG policy Message-ID: <20040913180117.GC14537@pglaf.org> Forwarding. ----- Forwarded message from Michael Hart ----- From: Michael Hart To: Greg Newby Subject: Re: [gutvol-d] Linking to PG policy Date: Mon, 13 Sep 2004 07:49:16 -0700 (PDT) Personally, I think that disallowing direct linking to our files will encourage traffic to other sites that do allow for click-throughs to our eBooks. . .such as Google, just to mention one sort of thing. You can forward to the list if you like.
Michael ----- End forwarded message ----- From Bowerbird at aol.com Mon Sep 13 14:59:45 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Sep 13 15:00:02 2004 Subject: [gutvol-d] Linking to PG policy Message-ID: <8e.1505bbbc.2e777251@aol.com> there are a lot of deep links already out there, aren't there? it would be an unnecessary shame to break them, wouldn't it? -bowerbird From marcello at perathoner.de Tue Sep 14 00:34:37 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Sep 14 00:35:46 2004 Subject: Fwd: Re: [gutvol-d] Linking to PG policy In-Reply-To: <20040913180117.GC14537@pglaf.org> References: <20040913180117.GC14537@pglaf.org> Message-ID: <41469F0D.6060102@perathoner.de> Michael Hart wrote: > Personally, I think that disallowing direct linking to our files > will encourage traffic to other sites that do allow for click-throughs > to our eBooks. . .such as Google, just to mention one sort of thing. A deep link to an eBook file on ibiblio does not generate any traffic to the PG site. The user gets the file and does not see our site at all. -- Marcello Perathoner webmaster@gutenberg.net From marcello at perathoner.de Tue Sep 14 00:39:55 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Sep 14 00:41:01 2004 Subject: [gutvol-d] Linking to PG policy In-Reply-To: <8e.1505bbbc.2e777251@aol.com> References: <8e.1505bbbc.2e777251@aol.com> Message-ID: <4146A04B.4050808@perathoner.de> Bowerbird@aol.com wrote: > there are a lot of deep links already out there, aren't there? > > it would be an unnecessary shame to break them, wouldn't it? Nobody is breaking them on purpose. They break because the files get moved. Our policy directs users to set links to URLs that will not break.
-- Marcello Perathoner webmaster@gutenberg.net From Bowerbird at aol.com Tue Sep 14 04:09:38 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Sep 14 04:10:04 2004 Subject: [gutvol-d] Linking to PG policy Message-ID: <103.4fb7bedc.2e782b72@aol.com> marcello said: > Nobody is breaking them on purpose. > They break because the files get moved. but the files are being moved on purpose, aren't they? so can't something be done? what about redirects? or nice little notes that tell the person where the file has been moved to? a coupon for $1 from see's candies? anything seems better than a 404. but perhaps there's something about the problem i don't understand. if that's the case, feel free to ignore me... :+) > Our policy directs users to set links to URLs that will not break. that's a good policy. people making new links should follow it. i'm just thinking about the links (who knows how many?) that are already out there that nobody is maintaining, or which couldn't be changed even if someone wanted to... -bowerbird From pm002c3918 at blueyonder.co.uk Mon Sep 13 10:54:00 2004 From: pm002c3918 at blueyonder.co.uk (Miranda van de Heijning) Date: Tue Sep 14 10:40:30 2004 Subject: [gutvol-d] Top 100 References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de> Message-ID: <003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> Hi, Out of curiosity, which period of time does the list cover? And does it update automatically or will this list be outdated after a while? And how many times was the number one book actually downloaded? And number 2? And number 3? Ah well, you get the picture. I am actually very curious after number 100, the Anatomy of Melancholy's exact figures, it being one of DP's pet projects. Per Michael Hart's suggestion, I would also love to have more figures per category, perhaps also per language etc. Not sure if they are difficult to generate, but I love stats and there couldn't be enough of those on gutenberg.net in my view. 
Marcello, thanks a lot for creating the list! Miranda ----- Original Message ----- From: "Marcello Perathoner" To: "Michael S. Hart" ; "Project Gutenberg Volunteer Discussion" Sent: Monday, September 13, 2004 6:00 PM Subject: Re: [gutvol-d] Top 100 > Michael Hart wrote: > > > I would include all files, not sure why to disqualify MP3 files, or > > "house" > > remixes. . .hee hee! > > Because those files were downloaded in error by people who wanted to > have mp3 files with "House" music. > > If I didn't disqualify them they would sit in front of the top list > forever, being downloaded nearly 10 times more often than the next > non-"House" eBook. > > > > -- > Marcello Perathoner > webmaster@gutenberg.net > > > > From marcello at perathoner.de Tue Sep 14 13:14:43 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Sep 14 13:15:40 2004 Subject: [gutvol-d] Top 100 In-Reply-To: <003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de> <003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> Message-ID: <41475133.2040908@perathoner.de> Miranda van de Heijning wrote: > Out of curiosity, which period of time does the list cover? And does it > update automatically or will this list be outdated after a while? Since Sep 03. It updates every night. > And how many times was the number one book actually downloaded? And number > 2? And number 3? Ah well, you get the picture. I am actually very curious > after number 100, the Anatomy of Melancholy's exact figures, it being one of > DP's pet projects. I have added the numbers. > Per Michael Hart's suggestion, I would also love to have more figures per > category, perhaps also per language etc.
Not sure if they are difficult to > generate, but I love stats and there couldn't be enough of those on > gutenberg.net in my view. We'll see ... -- Marcello Perathoner webmaster@gutenberg.net From sly at victoria.tc.ca Tue Sep 14 15:34:46 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Sep 14 15:34:58 2004 Subject: [gutvol-d] Top 100 In-Reply-To: <41475133.2040908@perathoner.de> References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de> <003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> <41475133.2040908@perathoner.de> Message-ID: > Miranda van de Heijning wrote: > > > > Per Michael Hart's suggestion, I would also love to have more figures per > > category, perhaps also per language etc. Not sure if they are difficult to > > generate, but I love stats and there couldn't be enough of those on > > gutenberg.net in my view. I seem to recall seeing a page analyzing downloads from a PG ftp site. (can't remember which one.) It had a huge mass of statistics, including most often requested files, average files sizes, domains downloaded from, breakdown by file types, etc. If I look through my old emails, I may be able to find it... Andrew From marcello at perathoner.de Wed Sep 15 01:03:06 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Sep 15 01:04:41 2004 Subject: [gutvol-d] Top 100 In-Reply-To: References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de> <003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> <41475133.2040908@perathoner.de> Message-ID: <4147F73A.1000103@perathoner.de> Andrew Sly wrote: >>>Per Michael Hart's suggestion, I would also love to have more figures per >>>category, perhaps also per language etc. Not sure if they are difficult to >>>generate, but I love stats and there couldn't be enough of those on >>>gutenberg.net in my view. > > > I seem to recall seeing a page analyzing downloads from a PG ftp site. > (can't remember which one.) 
It had a huge mass of statistics, including > most often requested files, average files sizes, domains downloaded from, > breakdown by file types, etc. Start from: http://www.gutenberg.net/internal/stats/ user: books pass: internal There are stat files for the web page "pages" and the archive "files". Though the files page says ftp.ibiblio.org, all HTTP and FTP requests to the file archive are analyzed there. There are daily pages and a monthly page. Caveat emptor: much of the data there can be misleading if you don't know the tricks of the site. They are meant as a tool for me to watch if something goes terribly wrong, not as a download counting tool. -- Marcello Perathoner webmaster@gutenberg.net From cannona at fireantproductions.com Wed Sep 15 10:13:47 2004 From: cannona at fireantproductions.com (Aaron Cannon) Date: Wed Sep 15 10:28:46 2004 Subject: [gutvol-d] interesting stat from the news letter Message-ID: <6.1.2.0.0.20040915120920.01ed4e38@mail.fireantproductions.com> I usually skim the newsletter to see if there is anything new, and I noticed this little stat: 352 Average Per Month in 2004 355 Average Per Month in 2003. Anyone have any theories as to why we are doing fewer books per month on average this year? Not complaining, just a little curious. Sorry if this is old news. Sincerely Aaron Cannon -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From fvandrog at scripps.edu Wed Sep 15 10:47:44 2004 From: fvandrog at scripps.edu (Frank van Drogen) Date: Wed Sep 15 10:47:51 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <6.1.2.0.0.20040915120920.01ed4e38@mail.fireantproductions.com>
References: <6.1.2.0.0.20040915120920.01ed4e38@mail.fireantproductions.com> Message-ID: <6.1.2.0.0.20040915103948.01f94188@mail.scripps.edu> >I usually skim the newsletter to see if there is anything new, and I >noticed this little stat: > 352 Average Per Month in 2004 > 355 Average Per Month in 2003. > >Anyone have any theories as to why we are doing fewer books per month on >average this year? Not complaining, just a little curious. I think one reason that DP produces slightly fewer texts compared to approximately one year ago is the fact that more and more projects get HTML markup, which makes these texts a lot more useful (IMHO), but reduces the output because people are still supposed to prepare the text version as well. So having to produce two versions, and the fact that adding markup, images, links between ToC and indices just takes more time, reduces the quantity of the output. Having the HTML editions, however, definitely increases quality.... Frank >Sorry if this is old news. > > >Sincerely >Aaron Cannon > >-- >E-mail: cannona@fireantproductions.com >Skype: cannona >MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail >address.) > From joshua at hutchinson.net Wed Sep 15 11:40:39 2004 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Wed Sep 15 11:40:53 2004 Subject: [gutvol-d] interesting stat from the news letter Message-ID: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> DP produces about 200+ new books a month. Unfortunately, the proofers at DP finish about 250 books a month, which means we have an ungodly backlog of texts that need to be post-processed (over 450 books right now). Our proofing output has scaled up from last year, but our post-processing has not been able to keep up. There are plans for ways to relieve the bottleneck.
Unfortunately, developers to implement those ideas are another bottleneck. As big_bill at DP always says, though ... Those books aren't going anywhere and we will get to them eventually. :) JHutch ----- Original Message ----- From: Aaron Cannon Date: Wed, 15 Sep 2004 12:13:47 -0500 To: gutvol-d@lists.pglaf.org Subject: [gutvol-d] interesting stat from the news letter > I usually skim the news letter to see if there is anything new, and I > noticed this little stat: > 352 Average Per Month in 2004 > 355 Average Per Month in 2003. > > Anyone have any theories as to why we are doing fewer books per month on > average this year? Not complaining, just a little curious. > > Sorry if this is old news. > > > Sincerely > Aaron Cannon > > -- > E-mail: cannona@fireantproductions.com > Skype: cannona > MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From Bowerbird at aol.com Wed Sep 15 14:27:32 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Sep 15 14:27:53 2004 Subject: [gutvol-d] interesting stat from the news letter Message-ID: frank said: > I think one reason that DP produces > a little less texts compared to > approximately one year ago is the fact that > more and more projects get HTML markup i believe that the .html converter programs (and the preparation of the plain-text files) should be improved to the point where they generate the .html version _automatically_. not only would this help cure the backlog, it would ensure that the .html versions are _uniform_ and _consistent_ across the library, which would make them _far_ more valuable... thank you for your time and consideration. 
-bowerbird From cannona at fireantproductions.com Wed Sep 15 18:11:17 2004 From: cannona at fireantproductions.com (Aaron Cannon) Date: Wed Sep 15 18:12:29 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> References: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> Message-ID: <6.1.2.0.0.20040915200916.01ecfd20@mail.fireantproductions.com> Good to know. I didn't realize that there was such a backlog. Good luck with that, especially with regards to automating the process. I would love to help, but I really have my hands full with the CD/DVD project. :) Sincerely Aaron Cannon At 01:40 PM 9/15/2004, you wrote: >DP produces about 200+ new books a month. Unfortunately, the proofers at >DP, finish about 250 books a month. Which means we have an ungodly >backlog of texts that need to be post-processed (over 450 books right >now). Our proofing output has scaled up from last year, but our >post-processing has not been able to keep up. There are plans for ways to >improve the bottleneck. Unfortunately, developers to implement those >ideas are another bottleneck. > >As big_bill at DP always says, though ... Those books aren't going >anywhere and we will get to them eventually. :) > >JHutch > >----- Original Message ----- >From: Aaron Cannon >Date: Wed, 15 Sep 2004 12:13:47 -0500 >To: gutvol-d@lists.pglaf.org >Subject: [gutvol-d] interesting stat from the news letter > > > I usually skim the news letter to see if there is anything new, and I > > noticed this little stat: > > 352 Average Per Month in 2004 > > 355 Average Per Month in 2003. > > > > Anyone have any theories as to why we are doing fewer books per month on > > average this year? Not complaining, just a little curious. > > > > Sorry if this is old news. 
> > > > > > Sincerely > > Aaron Cannon > > > > -- > > E-mail: cannona@fireantproductions.com > > Skype: cannona > > MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail > address.) > > > > > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From sly at victoria.tc.ca Wed Sep 15 18:21:08 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Wed Sep 15 18:21:24 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> References: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> Message-ID: On Wed, 15 Sep 2004, Joshua Hutchinson wrote: > DP produces about 200+ new books a month. Unfortunately, the proofers at DP, finish about 250 books a month. Which means we have an ungodly backlog of texts that need to be post-processed (over 450 books right now). Interesting. I thought it was more than that. Right now, the list of "silver star" etexts (those which have finished first and second round, but have not yet finished post-proofing) has 1,890 titles. (With the oldest apparently going back to late 2002.) Andrew From pm002c3918 at blueyonder.co.uk Thu Sep 16 09:27:08 2004 From: pm002c3918 at blueyonder.co.uk (Miranda van de Heijning) Date: Thu Sep 16 09:27:29 2004 Subject: [gutvol-d] Top 100 References: <4145CB5A.1090604@perathoner.de> <4145D23C.6060100@perathoner.de><003701c499ba$a6b5de00$0302a8c0@PAULANDMIRANDA> <41475133.2040908@perathoner.de> Message-ID: <004001c49c0a$035d7900$0302a8c0@PAULANDMIRANDA> Thanks for the extra figures Marcello! 
I'm sure that will keep me usefully occupied on a weekly basis--there's a lot of interesting information in there. Not just for curiosity's sake, but also to determine what sort of books and authors are likely to get an audience. Of course this doesn't mean we should only work on the most popular works, but it would be a useful tool to identify any gaps we may have in the collection. Ala, just a thought. Miranda ----- Original Message ----- From: "Marcello Perathoner" To: "Project Gutenberg Volunteer Discussion" Sent: Tuesday, September 14, 2004 9:14 PM Subject: Re: [gutvol-d] Top 100 > Miranda van de Heijning wrote: > > > > Out of curiosity, which period of time does the list cover? And does it > > update automatically or will this list be outdated after a while? > > Since Sep 03. It updates every night. > > > > And how many times was the number one book actually downloaded? And number > > 2? And number 3? Ah well, you get the picture. I am actually very curious > > after number 100, the Anatomy of Melancholy's exact figures, it being one of > > DP's pet projects. > > I have added the numbers. > > > > Per Michael Hart's suggestion, I would also love to have more figures per > > category, perhaps also per language etc. Not sure if they are difficult to > > generate, but I love stats and there couldn't be enough of those on > > gutenberg.net in my view. > > We'll see ... 
> > > > -- > Marcello Perathoner > webmaster@gutenberg.net > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > From joshua at hutchinson.net Mon Sep 13 10:01:35 2004 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Sep 17 07:26:15 2004 Subject: [gutvol-d] Top 100 Message-ID: <20040913170135.7E1591096F2@ws6-4.us4.outblaze.com> ----- Original Message ----- From: Michael Hart Date: Mon, 13 Sep 2004 09:50:07 -0700 (PDT) To: Project Gutenberg Volunteer Discussion Subject: Re: [gutvol-d] Top 100 > > It turns out I had to disqualify those as well as a few others that have mp3 > > files and use the word "House" in the title. (Moreover, research has shown > > that "Usher" is a rap artist. :-) > > OK, perhaps I was just expecting something more serious here, but. . . . > > I would include all files, not sure why to disqualtify MP3 files, or "house" > remixes. . .hee hee! > > I think we should measure everything, though I think sub-lists would > be acceptable. . .such as the "Whole Top 100," then fiction, non-fiction, > .txt files, .htm files, .mp3 files, etc., etc., etc., > The numbers are inflated because people probably hit Google, typed mp3 and usher, and downloaded our file. The *thought* they were getting a music file by a popular artist, not a computer read version of the House of Usher. So, they most likely moved the file to the recycle bin immediately. That, and the RIAA probably hit our file half a billion times to check if it WAS an usher song. I'm surprised we haven't received any automated cease and desist letters from the brain trust lawyers at the RIAA. After all, by their normal logic, if it has USHER in the title, it must be their song. 
Josh From Bowerbird at aol.com Fri Sep 17 10:26:34 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Sep 17 10:26:44 2004 Subject: [gutvol-d] interesting stat from the news letter Message-ID: <88.14c0c2c3.2e7c784a@aol.com> aaron said: > I didn't realize that there was such a backlog. Good luck > with that, especially with regards to automating the process. um, my post in support of the automatic text-to-html converters didn't mean to suggest there is a special push on to perfect them... i know of no such focused effort. indeed, one can get one's head bitten off in the d.p. forums for suggesting improvement is needed, or even that a minimal standard of usefulness should be created. the modus operandi seems to be that every person is free to wander off and create their own idiosyncratic .html version, "as long as it validates." in my humble opinion, that's a recipe for a nonstandardized library -- the same one that has given us 14,000 inconsistent ascii texts -- and a nonstandardized library is shorthand for a less-useful library. that reminds me, david moynihan recently offered his extensive collection of thousands of .html versions to project gutenberg. the only substantial reaction i saw was "your files don't validate". what was the response to that kind offer? was it turned down? 
-bowerbird From traverso at dm.unipi.it Fri Sep 17 11:16:42 2004 From: traverso at dm.unipi.it (Carlo Traverso) Date: Fri Sep 17 11:16:52 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <6.1.2.0.0.20040915103948.01f94188@mail.scripps.edu> (message from Frank van Drogen on Wed, 15 Sep 2004 10:47:44 -0700) References: <6.1.2.0.0.20040915120920.01ed4e38@mail.fireantproductions.com> <6.1.2.0.0.20040915103948.01f94188@mail.scripps.edu> Message-ID: <200409171816.i8HIGg2m016246@posso.dm.unipi.it> >>>>> "Frank" == Frank van Drogen writes: >> I usually skim the news letter to see if there is anything new, >> and I noticed this little stat: 352 Average Per Month in 2004 >> 355 Average Per Month in 2003. >> >> Anyone have any theories as to why we are doing fewer books per >> month on average this year? Not complaining, just a little >> curious. Frank> I think one reason that DP produces a little less texts Frank> compared to approximately one year ago is the fact that Frank> more and more projects get HTML markup, which makes these Frank> texts a lot more usefull (IMHO), but reduces the output Frank> because people are still supposed to prepare the text Frank> version as well. So having to produce two versions, and the Frank> fact that adding markup, images, links between ToC and Frank> indices just takes more time reduces the quantity of the Frank> output. Having the HTML editions, however, definitely Frank> increases quality.... I don't think that DP has produced less texts; in one year, we passed to 2000 to 5000 posted books; but the percentage of non-DP contribution to PG has dropped to almost nothing. Part of the 2003 average however was due to inflated numbers: automatic MP3 and books with chapters posted with different numbers. I cannot say if the "freelance" volunteers have passed to DP, been discouraged by DP, or something else. It would be interesting to understand. 
Carlo From cannona at fireantproductions.com Fri Sep 17 14:30:17 2004 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Sep 17 14:30:45 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <200409171816.i8HIGg2m016246@posso.dm.unipi.it> References: <6.1.2.0.0.20040915120920.01ed4e38@mail.fireantproductions.com> <6.1.2.0.0.20040915103948.01f94188@mail.scripps.edu> <200409171816.i8HIGg2m016246@posso.dm.unipi.it> Message-ID: <6.1.2.0.0.20040917162443.01ebaec0@mail.fireantproductions.com> I would guess that most of the volunteers have gone over to DP. I have no information to backup this speculation, however, so take it for what it's worth. The reason I feel this way is simply because, from what I've seen, the DP model works better than the "old way" for the majority of projects. By the way, on an unrelated note, everyone who receives a CD or DVD from us is encouraged to go over and check out DP, so hopefully that has given you all a few more volunteers. Keep up the great work. Sincerely Aaron Cannon At 01:16 PM 9/17/2004, you wrote: > >>>>> "Frank" == Frank van Drogen writes: > > >> I usually skim the news letter to see if there is anything new, > >> and I noticed this little stat: 352 Average Per Month in 2004 > >> 355 Average Per Month in 2003. > >> > >> Anyone have any theories as to why we are doing fewer books per > >> month on average this year? Not complaining, just a little > >> curious. > > Frank> I think one reason that DP produces a little less texts > Frank> compared to approximately one year ago is the fact that > Frank> more and more projects get HTML markup, which makes these > Frank> texts a lot more usefull (IMHO), but reduces the output > Frank> because people are still supposed to prepare the text > Frank> version as well. So having to produce two versions, and the > Frank> fact that adding markup, images, links between ToC and > Frank> indices just takes more time reduces the quantity of the > Frank> output. 
Having the HTML editions, however, definitely > Frank> increases quality.... > >I don't think that DP has produced less texts; in one year, we passed >to 2000 to 5000 posted books; but the percentage of non-DP >contribution to PG has dropped to almost nothing. Part of the 2003 >average however was due to inflated numbers: automatic MP3 and books >with chapters posted with different numbers. > >I cannot say if the "freelance" volunteers have passed to DP, been >discouraged by DP, or something else. It would be interesting to >understand. > >Carlo > >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From jeroen at bohol.ph Wed Sep 15 14:16:52 2004 From: jeroen at bohol.ph (Jeroen Hellingman) Date: Sat Sep 18 03:54:43 2004 Subject: [gutvol-d] interesting stat from the news letter In-Reply-To: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> References: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> Message-ID: <4148B144.6040900@bohol.ph> I'm one of those people guilty of big backlogs, with 13 books in various stages of completion in Post Processing after going through DP. I always do SGML, XML, and HTML, have heavily illustrated works with lots of tables, and these take a lot of work, no matter what the proofreaders do. A few times I've just left out the most horrendous tables, to type them personally, or even drop them from the work altogether. Then, adding ASCII versions also adds to the delay, as I can generate HTML from the SGML automatically, but ASCII simply is too hard to automate, which means redoing the tables again. An average novel can be PP-ed in a few hours, but those scientific works take many hours. 
My longest running project (not through DP) is Alberuni's India, with
long citations in Greek, Persian, Sanskrit, etc., often in the original
scripts (five different scripts in this book). Spread out over five
years, hundreds of hours have gone into it, and it is not yet done.

Jeroen.

Joshua Hutchinson wrote:

>DP produces about 200+ new books a month. Unfortunately, the proofers at DP, finish about 250 books a month. Which means we have an ungodly backlog of texts that need to be post-processed (over 450 books right now). Our proofing output has scaled up from last year, but our post-processing has not been able to keep up. There are plans for ways to improve the bottleneck. Unfortunately, developers to implement those ideas are another bottleneck.
>
>As big_bill at DP always says, though ... Those books aren't going anywhere and we will get to them eventually. :)
>
>JHutch
>
>

From marcello at perathoner.de Sat Sep 18 10:38:03 2004
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sat Sep 18 10:39:08 2004
Subject: [gutvol-d] www.gutenberg.org
Message-ID: <414C727B.6080406@perathoner.de>

At my request, ibiblio has changed our Apache virtual host ServerName
from gutenberg.net to www.gutenberg.org.

What?

Nothing changes in web site operation except one small detail: if you
hit a non-existent URL, you are redirected to www.gutenberg.org
instead of gutenberg.net.

Why?

The .org top-level domain is meant for non-profit organizations, while
the .net domain is meant for network infrastructure such as providers,
backbones etc. We own both gutenberg.net and gutenberg.org, but using
the latter is simply more standards-compliant.

Although the URLs www.gutenberg.net and www.gutenberg.org give exactly
the same results, you should start using www.gutenberg.org in all
publications, papers etc.
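
For anyone updating old links in papers or web pages, a minimal sketch of the host swap follows. It assumes only what the announcement says (gutenberg.net and www.gutenberg.net serve the same content as www.gutenberg.org); the function name and the example path are made up for illustration, and any port number in the old URL is simply dropped.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts to rewrite. Taken from the announcement; the set name is illustrative.
OLD_HOSTS = {"gutenberg.net", "www.gutenberg.net"}
NEW_HOST = "www.gutenberg.org"

def canonical_url(url):
    """Return url with an old gutenberg.net host replaced by www.gutenberg.org."""
    parts = urlsplit(url)
    # .hostname is lower-cased and excludes any port or credentials.
    if parts.hostname in OLD_HOSTS:
        parts = parts._replace(netloc=NEW_HOST)
    return urlunsplit(parts)

# The path below is a hypothetical example, not a real page.
print(canonical_url("http://www.gutenberg.net/some/page.html"))
```

URLs on other hosts pass through unchanged, so the function can be run over a whole list of links.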
-- Marcello Perathoner webmaster@gutenberg.org From ke at gnu.franken.de Sat Sep 18 10:51:08 2004 From: ke at gnu.franken.de (Karl Eichwalder) Date: Sat Sep 18 10:45:06 2004 Subject: [gutvol-d] Re: interesting stat from the news letter In-Reply-To: <4148B144.6040900@bohol.ph> (Jeroen Hellingman's message of "Wed, 15 Sep 2004 23:16:52 +0200") References: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> <4148B144.6040900@bohol.ph> Message-ID: Jeroen Hellingman writes: > I'm one of those people guilty of big backlogs, with 13 books in various > stages of completion in Post Processing after going through DP. Thanks for your commitment! > I always do SGML, XML, and HTML, have heavily illustrated works with > lots of tables, and these take a lot of work, no matter what the > proofreaders do. A few times I've just left out the most horrendous > tables, to type them personally, or even drop them from the work > altogether. Things like these take many resources ;-( Nevertheless, droping contents alltogetther is not appropriate - better include them asis, even if they would look very wrong. Thus the reader can see this material and lend a helping hand... > Then, adding ASCII versions also adds to the delay, as I can generate > HTML from the SGML automatically, but ASCII simply is too hard to > automate, which means redoing the tables again. Heresy alert: As long as the book is readable with lynx, w3m, or another text browser, I wouldn't spend time on ASCII versions. Better consider to create PDFs from the SGML/XML source files. BTW, pdftotext from the xpdf suite is worth a try! > An average novel can be PP-ed in a few hours, but those scientific works > take many hours. My longest running project (not through DP) is > Alberuni's India, with long citations in Greek, Persian, Sanskrit, etc., > often in the original scripts (five different scripts in this book). > Spread out over five years, hundreds of hours have gone into it, and it > is not yet done. 
Yes, those books are not suitable for DP. Also, other books could be
proofread better and/or faster if one would read the original text
aloud (plus punctuation and stuff like that) and a partner proofreader
would listen to the reader while checking the OCR results. (Of course,
using a "Diktiergerät" (a dictating machine and an appropriate player
with pedals for control), you can go this way without a partner.)

--
|      ,__o
|    _-\_<,      http://www.gnu.franken.de/ke/
|   (*)/'(*)

From Bowerbird at aol.com Sat Sep 18 13:19:23 2004
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Sep 18 13:19:40 2004
Subject: [gutvol-d] interesting stat from the news letter
Message-ID: <100.19f1d84.2e7df24b@aol.com>

jeroen said:
> Then, adding ASCII versions also adds to the delay,
> as I can generate HTML from the SGML automatically,
> but ASCII simply is too hard to automate,
> which means redoing the tables again.

i am confused about your work-process, jeroen,
perhaps you could explain it to me?

it would seem to me that the _output_ of the
proofing process is the raw text, which one
polishes as the ascii version and then
-- if one chooses -- marks up some way.

but from what you say, you work a different way,
the .sgml (magically?) comes _first_, and is used to
generate other versions, including the plain-text.

what gives?

-bowerbird

p.s. i guess i'm also perplexed _why_ it is difficult
to generate plain-ascii from .sgml. isn't it a
straightforward matter of converting (or, in some cases,
stripping away) the markup? or printing to .pdf
and recovering text from that?

From shalesller at writeme.com Sat Sep 18 17:38:18 2004
From: shalesller at writeme.com (D. Starner)
Date: Sat Sep 18 17:38:31 2004
Subject: [gutvol-d] Indexing Editors, etc.
Message-ID: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com>

There are three books just added to the database that
were scanned by me: Essay on Wit, The Story of Sigurd
the Volsung, and An Enquiry Concerning the Principles of Taste.
None of them have all the people listed on the title page included in the index. Why? At least in the case of Sigurd, the missing people rewrote a significant part of the text. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm From gbnewby at pglaf.org Sat Sep 18 19:56:01 2004 From: gbnewby at pglaf.org (Greg Newby) Date: Sat Sep 18 19:56:03 2004 Subject: [gutvol-d] Indexing Editors, etc. In-Reply-To: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com> References: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com> Message-ID: <20040919025601.GA11878@pglaf.org> On Sat, Sep 18, 2004 at 04:38:18PM -0800, D. Starner wrote: > There are three books just added to the database that > were scanned by me: Essay on Wit, The Story of Sigurd > the Volsung, and An Enquiry Concerning the Principles > of Taste. None of them have all the people listed on > the title page included in the index. Why? At least > in the case of Sigurd, the missing people rewrote > a significant part of the text. Sorry about that, David. The reason is that the automatic cataloging program only picks up the metadata in the book header like Author:, Title:, etc. Email corrections (be specific, please!) to catalog@pglaf.org and we'll fix 'em (we=Andrew or I). You might also check whether the listing in GUTINDEX.ALL is correct for these entries. -- Greg From shalesller at writeme.com Sat Sep 18 22:03:21 2004 From: shalesller at writeme.com (D. Starner) Date: Sat Sep 18 22:03:38 2004 Subject: [gutvol-d] Indexing Editors, etc. Message-ID: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com> Greg Newby writes: > Sorry about that, David. The reason is that the automatic cataloging > program only picks up the metadata in the book header like Author:, > Title:, etc. So is this something that I should take up within DP? What needs to be done that isn't? 
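
The header scan Greg describes can be pictured roughly as follows. This is only a sketch under stated assumptions: it supposes the header consists of "Field: value" lines near the top of the file, and the function name, the field list, the 40-line cutoff, and the sample text are all invented for illustration. It is not the actual cataloging program.

```python
# Hypothetical sketch: collect "Field: value" lines from the top of an eBook.
# The field list and the line limit are assumptions, not the real program's.
KNOWN_FIELDS = {"title", "author"}

def scan_header(text, max_lines=40):
    """Return a dict mapping lower-cased field names to lists of values."""
    found = {}
    for line in text.splitlines()[:max_lines]:
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        # Keep only lines that look like a recognised, non-empty header field.
        if sep and key in KNOWN_FIELDS and value.strip():
            found.setdefault(key, []).append(value.strip())
    return found

# Made-up sample: the editor is named in the prose, but has no header line.
sample = """Title: An Example Book
Author: Jane Doe

Edited, with an introduction, by John Roe
"""
print(scan_header(sample))
```

In this sketch anyone credited only in running text, such as the editor in the sample, never reaches the result, which is the kind of gap described above.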
From sly at victoria.tc.ca Sat Sep 18 23:59:11 2004
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sat Sep 18 23:59:32 2004
Subject: [gutvol-d] Re: interesting stat from the news letter
In-Reply-To:
References: <20040915184039.BB9152F913@ws6-3.us4.outblaze.com> <4148B144.6040900@bohol.ph>
Message-ID:

On Sat, 18 Sep 2004, Karl Eichwalder wrote:

> Yes, those books are not suitable for DP. Also, other books could be
> proofread better and/or faster if one would read the original text
> aloud (plus punctuation and stuff like that) and a partner proofreader
> would listen to the reader while checking the OCR results. (Of course,
> using a "Diktiergerät" (a dictating machine and an appropriate player
> with pedals for control), you can go this way without a partner.)
>

Not being able to find a willing partner, I have occasionally
simulated this effect by reading sections of a text out loud
(including my own shorthand conventions for punctuation), recording
them on a small handheld voice recorder, and then playing the recording
back while following along with the digitised version.

I like the idea of the machine you mention...

Andrew

From gbnewby at pglaf.org Sun Sep 19 00:26:12 2004
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Sep 19 00:26:12 2004
Subject: [gutvol-d] Indexing Editors, etc.
In-Reply-To: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com>
References: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com>
Message-ID: <20040919072612.GA15401@pglaf.org>

On Sat, Sep 18, 2004 at 09:03:21PM -0800, D. Starner wrote:
> Greg Newby writes:
> > Sorry about that, David. The reason is that the automatic cataloging
> > program only picks up the metadata in the book header like Author:,
> > Title:, etc.
>
> So is this something that I should take up within DP? What needs to
> be done that isn't?
Catalog/index entries are created automatically (GUTINDEX.ALL is created by hand, from the Posted messages). So, it's probably best to do what you're doing: check the index the day after posting, and email changes/fixes to catalog@pglaf.org We *can* add fields like "Translator: " and "Illustrator: " to the eBook metadata, which are picked up by the automatic catalog creator. But for complex author lists we need to tweak the catalog entry by hand. I am not sure we can automate things much more than they are already, but if you see areas for improvement either with the DP process or the WW process, speak up and we'll see whether the ideas are viable. I expect that these types of problems will go away when we eventually start having richer (or at least better formatted) metadata in eBook files "born as XML," but with our current procedure there's a little too much variety in the eBook layout, author names/roles, and so forth to be able to create completely automatic catalog entries from the eBooks themselves. Thanks! -- Greg From traverso at dm.unipi.it Sun Sep 19 00:43:16 2004 From: traverso at dm.unipi.it (Carlo Traverso) Date: Sun Sep 19 00:43:37 2004 Subject: [gutvol-d] Indexing Editors, etc. In-Reply-To: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com> (shalesller@writeme.com) References: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com> Message-ID: <200409190743.i8J7hGxv004874@posso.dm.unipi.it> >>>>> "D." == D Starner writes: D.> Greg Newby writes: >> Sorry about that, David. The reason is that the automatic >> cataloging program only picks up the metadata in the book >> header like Author:, Title:, etc. D.> So is this something that I should take up within DP? What D.> needs to be done that isn't? -- I think that something has rather to be done on PG end: include better metadata in the book header. Posting collects complete metadata, but only a tiny part finds its way to the top of the PG ebook. 
I don't see the point of collecting a lot of info, then discard it and include only a part. If it is felt to be intrusive at the top of the book, why do we not include complete metadata at the bottom, after the licence? The cataloguing program can search the bottom in addition/instead of the top. I would recommend searching the bottom, and if there is no metadata there, go to the top. This would not require changing anything in existing postings. Carlo From sly at victoria.tc.ca Sun Sep 19 01:18:52 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Sun Sep 19 01:19:12 2004 Subject: [gutvol-d] Indexing Editors, etc. In-Reply-To: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com> References: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com> Message-ID: Hi David. I've just spent some time comparing and looking up names etc. and adding into the catalog extra information for the three titles you mention here. I could see problems with trying to automate this as it was not neccessarily straight-forward situation. I did make a judgement call in some cases. For instance, recording someone who wrote an introduction as a "contributor". Take a look and let me know what you think... On another topic, there does seem to be a bit of a gap. Sometimes a person involved in digitising a text will be aware of information of bibliographical interest, but has no easy way to pass that on to people who may be working on cataloging. On Sat, 18 Sep 2004, D. Starner wrote: > There are three books just added to the database that > were scanned by me: Essay on Wit, The Story of Sigurd > the Volsung, and An Enquiry Concerning the Principles > of Taste. None of them have all the people listed on > the title page included in the index. Why? At least > in the case of Sigurd, the missing people rewrote > a significant part of the text. 
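
Carlo's suggestion earlier in the thread (look for metadata at the bottom of the file first, and fall back to the top when none is there) could be sketched along these lines. Everything specific here is an assumption made for illustration: the "Field: value" layout, the single-word field names, the 20-line window, and the function names. It is not how the PG cataloging program works.

```python
# Hypothetical sketch of "search the bottom first, then fall back to the top".
# Layout, window size, and names are assumptions for illustration only.

def parse_fields(lines):
    """Collect single-word 'Field: value' pairs from a list of lines."""
    fields = {}
    for line in lines:
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Require a one-word field name so ordinary prose is mostly skipped.
        if sep and name and " " not in name and value:
            fields[name.lower()] = value
    return fields

def find_metadata(text, window=20):
    """Return (where, fields), preferring the last `window` lines of the file."""
    lines = text.splitlines()
    bottom = parse_fields(lines[-window:])
    if bottom:
        return "bottom", bottom
    return "top", parse_fields(lines[:window])

# Made-up files: an existing posting (header on top) and a new-style one.
old_style = "Title: Old Book\nAuthor: A. Writer\n" + "body text\n" * 50
new_style = "body text\n" * 50 + "Title: New Book\nAuthor: B. Writer\nEditor: C. Helper\n"

print(find_metadata(old_style))
print(find_metadata(new_style))
```

Because the fallback only triggers when nothing is found at the bottom, existing postings would keep working unchanged, which is the property Carlo points out. For files shorter than the window the two searches overlap, which a real implementation would need to handle.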
>

From marcello at perathoner.de Sun Sep 19 09:10:30 2004
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Sep 19 09:11:35 2004
Subject: [gutvol-d] Indexing Editors, etc.
In-Reply-To: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com>
References: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com>
Message-ID: <414DAF76.7070006@perathoner.de>

D. Starner wrote:

> Greg Newby writes:
>
>>Sorry about that, David. The reason is that the automatic cataloging
>>program only picks up the metadata in the book header like Author:,
>>Title:, etc.
>
> So is this something that I should take up within DP? What needs to
> be done that isn't?

We (WWs and me) have been discussing ways to fix the meta-data transfer
between DP and the PG catalog.

What we came up with is:

- Put a unique identifier in the last line(s) of the text.
  This would allow the catalog database to query the database
  at DP for all missing info.

or

- Put a DC or XML/RDF metadata block at the end of the file.


Example of DC metadata block:

END OF THE PROJECT ...

dc.author: Twain, Mark
dc.title: 1601
dc.language: en
dc.encoding: us-ascii
dc.publisher: Project Gutenberg
dc.rights: http://www.gutenberg.org/license
pg.etext: 12345
pg.id: af04.bd32.1234.5678

EOF


Example of RDF/XML metadata block:

END OF THE PROJECT ...

Project Gutenberg
An Enquiry Concerning the Principles of Taste, and of the Origin of our Ideas of Beauty, etc.
Reynolds, Frances
Clifford, James L. [Contributor]
en
2004-09-17
0123.4567.89ab.cdef

EOF


--
Marcello Perathoner
webmaster@gutenberg.net

From marcello at perathoner.de Sun Sep 19 09:17:25 2004
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Sep 19 09:18:28 2004
Subject: [gutvol-d] Indexing Editors, etc.
In-Reply-To: <20040919025601.GA11878@pglaf.org>
References: <20040919003818.6FDCD4BDA8@ws1-1.us4.outblaze.com> <20040919025601.GA11878@pglaf.org>
Message-ID: <414DB115.9060100@perathoner.de>

Greg Newby wrote:

> Sorry about that, David.
> The reason is that the automatic cataloging
> program only picks up the metadata in the book header like Author:,
> Title:, etc.

Actually it picks up the following roles if it finds a header line
starting with the role followed by a colon.

    # roles recognised in header lines
    my $roles = {
        'author'      => undef,
        'creator'     => undef,
        'translator'  => undef,
        'editor'      => undef,
        'compiler'    => undef,
        'illustrator' => undef,
        'annotator'   => undef,
        'commentator' => undef,
        'performer'   => undef,
    };

The header lines just have to be generated ...


--
Marcello Perathoner
webmaster@gutenberg.net

From marevalo at marevalo.net Sun Sep 19 09:38:41 2004
From: marevalo at marevalo.net (Miguel A. Arévalo)
Date: Sun Sep 19 09:38:12 2004
Subject: [gutvol-d] Indexing Editors, etc.
In-Reply-To: <414DAF76.7070006@perathoner.de>
References: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com> <414DAF76.7070006@perathoner.de>
Message-ID: <1095611922.4494.2.camel@localhost.localdomain>

And it would be great to have the complete bibliographical record of the
book (or books) used as the source for the digital edition in every new
text.

Regards,
Miguel A. Arévalo.

On Sun, 19-09-2004 at 18:10 +0200, Marcello Perathoner wrote:
> What we came up with is:
>
> - Put a unique identifier in the last line(s) of the text.
>   This would allow the catalog database to query the database
>   at DP for all missing info.
>
> or
>
> - Put a DC or XML/RDF metadata block at the end of the file.

From shalesller at writeme.com Sun Sep 19 12:12:27 2004
From: shalesller at writeme.com (D. Starner)
Date: Sun Sep 19 12:12:38 2004
Subject: [gutvol-d] Indexing Editors, etc.
Message-ID: <20040919191227.A4F254BDA8@ws1-1.us4.outblaze.com>

Andrew Sly writes:
> I did make a
> judgement call in some cases. For instance, recording someone
> who wrote an introduction as a "contributor".

It seems odd that there should be a judgement call here (and
it's one I've run into in the copyright clearances, too).
A lot of books have introductions, and there should be a standard way of noting them. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm From Bowerbird at aol.com Sun Sep 19 12:43:44 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sun Sep 19 12:44:03 2004 Subject: [gutvol-d] Indexing Editors, etc. Message-ID: <15a.3f9e42a9.2e7f3b70@aol.com> greg said: > I am not sure we can automate things > much more than they are already, this speaks volumes. (look, a pun!) in my ever-humble opinion, you _need_ to create files that _can_ be parsed unambiguously. and the number-one test of that is whether _you_ can parse 'em. a digital library where every book has to be handled _individually_ sacrifices far too much potential. > I expect that these types of problems > will go away when we eventually start > having richer (or at least better formatted) > metadata in eBook files "born as XML," > but with our current procedure there's > a little too much variety in the eBook layout, > author names/roles, and so forth to be able to > create completely automatic catalog entries > from the eBooks themselves. no comment. -bowerbird From sly at victoria.tc.ca Sun Sep 19 12:56:55 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Sun Sep 19 12:57:05 2004 Subject: [gutvol-d] Indexing Editors, etc. In-Reply-To: <20040919191227.A4F254BDA8@ws1-1.us4.outblaze.com> References: <20040919191227.A4F254BDA8@ws1-1.us4.outblaze.com> Message-ID: On Sun, 19 Sep 2004, D. Starner wrote: > Andrew Sly writes: > > I did make a > > judgement call in some cases. For instance, recording someone > > who wrote an introduction as a "contributor". > > It seems odd that there should be a judgement call here (and > it's one I've run into in the copyright clearances, too). A lot > of books have introductions, and there should be a standard > way of noting them. 
> >From what I've seen, in "traditional" library catalogs, you are likely to see a more full transcription of information on the title page. i.e. The Moccasin Maker / By E. Pauline Johnson / With introduction by Sir Gilbert Parker / and appreciation by Charles Mair. But not to have any additional entries for other author's names. However, that has not been the practise at PG. (Particularly as, for earlier titles, there may be nothing left from the original Title page.) I suppose that if something seems significant enough to warrant more explanation, we can add a comment in the "Note" field. Andrew From cannona at fireantproductions.com Sun Sep 19 15:54:04 2004 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sun Sep 19 15:54:46 2004 Subject: [gutvol-d] ot: gmail Message-ID: <6.1.2.0.0.20040919175134.01be0e20@mail.fireantproductions.com> If any volunteers have a need for a gmail account (I.E. sending large book images back and forth) E-mail me off list and I can give you one. -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From hacker at gnu-designs.com Sun Sep 19 16:29:45 2004 From: hacker at gnu-designs.com (David A. Desrosiers) Date: Sun Sep 19 16:30:59 2004 Subject: [gutvol-d] ot: gmail In-Reply-To: <6.1.2.0.0.20040919175134.01be0e20@mail.fireantproductions.com> References: <6.1.2.0.0.20040919175134.01be0e20@mail.fireantproductions.com> Message-ID: > If any volunteers have a need for a gmail account (I.E. sending large > book images back and forth) E-mail me off list and I can give you one. Better yet, use the GMail-o-Matic at isnoop. http://isnoop.net/gmailomatic.php David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From juliet.sutherland at verizon.net Sun Sep 19 09:41:31 2004 From: juliet.sutherland at verizon.net (Juliet Sutherland) Date: Sun Sep 19 21:57:16 2004 Subject: [gutvol-d] Indexing Editors, etc. 
References: <20040919050321.339DF1CE303@ws1-6.us4.outblaze.com> <414DAF76.7070006@perathoner.de> Message-ID: <010d01c49e67$86d3bce0$6401a8c0@Unicorn> DP does generate a DC file for each project. I'm not entirely sure what's in it, although I presume that it captures the information that we collect, which is: Title, Author, Language, Genre (by our definition, not an official cataloging one). We do not collect information about publication dates, multiple authors or creative content roles, publisher, etc. In these regards the new PG clearance system collects much more information, and is probably much more accurate as well. Many project managers (including myself) tend to shorten or adjust the titles so that they fit better on the project listing page. Similarly with author/illustrator/editor/etc. information. This is appropriate and useful for our internal purposes, but doesn't work well when mapped to anything external. All in all, I'd recommend using the information collected as part of the copyright clearance as a basis for cataloging. JulietS ----- Original Message ----- From: "Marcello Perathoner" To: "Project Gutenberg Volunteer Discussion" Sent: Sunday, September 19, 2004 12:10 PM Subject: Re: [gutvol-d] Indexing Editors, etc. > D. Starner wrote: > >> Greg Newby writes: >> >>>Sorry about that, David. The reason is that the automatic cataloging >>>program only picks up the metadata in the book header like Author:, >>>Title:, etc. >> >> So is this something that I should take up within DP? What needs to >> be done that isn't? > > We (WWs and me) have been discussing ways to fix the meta-data transfer > between DP and the PG catalog. > > What we came up with is: > > - Put a unique identifier in the last line(s) of the text. > This would allow the catalog database to query the database > at DP for all missing info. > > or > > - Put a DC or XML/RDF metadata block at the end of the file. > > > Example of DC metadata block: > > END OF THE PROJECT ...
> > dc.author: Twain, Mark > dc.title: 1601 > dc.language: en > dc.encoding: us-ascii > dc.publisher: Project Gutenberg > dc.rights: http://www.gutenberg.org/license > pg.etext: 12345 > pg.id: af04.bd32.1234.5678 > > EOF > > > Example of RDF/XML metadata block: > > END OF THE PROJECT ... > > xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" > xmlns:dc="http://purl.org/dc/elements/1.1/" > xmlns:dcterms="http://purl.org/dc/terms/" > xmlns:pg="http://www.gutenberg.org/pgrdf" > xml:base="http://www.gutenberg.org/rdf/catalog.rdf"> > > > Project Gutenberg > An Enquiry Concerning the Principles > of Taste, and of the Origin of our Ideas of Beauty, etc. > Reynolds, Frances > Clifford, James L. [Contributor] > en > 2004-09-17 > > 0123.4567.89ab.cdef > > > > > EOF > > > -- > Marcello Perathoner > webmaster@gutenberg.net > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lYP5g.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From marcello at perathoner.de Tue Sep 21 08:31:21 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Sep 21 06:31:10 2004 Subject: [gutvol-d] blogged by our russian friends Message-ID: <41504949.4040201@perathoner.de> Our russian friends were so charmed by our ebook "Hand Shadows to Be Thrown upon the Wall by Henry Bursill" http://www.gutenberg.net/etext/12962 that they downloaded it 4000 times. Somebody who reads Russian can go to http://www.dirty.ru/comments/16939 and tell us why. Its interesting to see how articles wander from one blog to the next. These are just some of them who carry the "Hand Shadow" story. 
http://www.di-links.com/modules.php?name=News&file=article&sid=419 http://www.robcruickshank.net/2004_09_01_archive.html#109495485107159780 http://exclamationmark.typepad.com/ http://www.wherethreadscomeloose.com/links.html http://www.monkeyfilter.com/ -- Marcello Perathoner webmaster@gutenberg.net From joel at oneporpoise.com Tue Sep 21 08:02:07 2004 From: joel at oneporpoise.com (Joel A. Erickson) Date: Tue Sep 21 08:01:38 2004 Subject: [gutvol-d] blogged by our russian friends References: <41504949.4040201@perathoner.de> Message-ID: <000601c49feb$f6c62c10$6501a8c0@JOEL> > Our russian friends were so charmed by our ebook "Hand Shadows to Be > Thrown upon the Wall by Henry Bursill" > > http://www.gutenberg.net/etext/12962 > > that they downloaded it 4000 times. > > > Somebody who reads Russian can go to > > http://www.dirty.ru/comments/16939 > > and tell us why. According to an online Russian-English translator, some of the comments as follows: "Fine completely. I have tried - and look, it has turned out. People in 1859 fingers were able to bend!" "Dream of my childhood. A class post! Doubtless [+] for today." "All represent animals. And where shadow a tractor, nuclear reactors and supersonic fighters? But in fact 1859..." "and shadow economy ." "And here still interestingly collective image of shadow designs." From Bowerbird at aol.com Tue Sep 21 09:59:48 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Sep 21 09:59:58 2004 Subject: [gutvol-d] re: posted Digest, Vol 4, Issue 27 Message-ID: this just in from the "posted" list... david said: > This compilation of all Mark Twain's works > published by Project Gutenberg was first posted > three years ago and has been updated with > additions and corrections three times since > -- this is the fourth and most extensive. > On this renovation the entire 15mb file > of 302,000 lines has been reprocessed > using present day PG proofing tools with > the correction of several thousand errors. yay! 
-bowerbird From sly at victoria.tc.ca Tue Sep 21 10:26:02 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Sep 21 10:26:09 2004 Subject: [gutvol-d] blogged by our russian friends In-Reply-To: <41504949.4040201@perathoner.de> References: <41504949.4040201@perathoner.de> Message-ID: On Tue, 21 Sep 2004, Marcello Perathoner wrote: > Our russian friends were so charmed by our ebook "Hand Shadows to Be > Thrown upon the Wall by Henry Bursill" > > http://www.gutenberg.net/etext/12962 > > that they downloaded it 4000 times. Now if there was some way we could bridge the language barrier and recruit a few of these Russian friends to help add material in their native language to PG.... Could we do that using hand shadows on the wall? Andrew From marcello at perathoner.de Wed Sep 22 06:44:33 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Sep 22 06:44:54 2004 Subject: [gutvol-d] New Online Reader Message-ID: <415181C1.6060507@perathoner.de> There is a new experimental online reader available. Start from any bibliographic record page, eg. http://www.gutenberg.net/etext/4300 Basically this paginates the txt file and remembers your last position in a cookie so you can later resume reading where you left off. Please test it. It should work with any book that has a text file where the encoding is known. -- Marcello Perathoner webmaster@gutenberg.net From meredydd at everybuddy.com Wed Sep 22 08:50:02 2004 From: meredydd at everybuddy.com (Meredydd) Date: Wed Sep 22 07:50:00 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <415181C1.6060507@perathoner.de> References: <415181C1.6060507@perathoner.de> Message-ID: <200409221650.04191.meredydd@everybuddy.com> Hey, thanks a lot, Marcello, that's fantastic! Have you spammed the newsletter people (who's doing it nowadays?) about this? I think it's well worth a mention... 
Incidentally, I notice that the prefatory materials, in particular, page-break neatly on the *** START OF THE PROJECT GUTENBERG ETEXT... line. Is this pure coincidence, or deliberate? If deliberate, is it a one-time heuristic that recognises the asterisks (or something similar), or do you do other heuristics in an attempt to locate chapter headings? How good do you find they are? Meredydd On Wednesday 22 September 2004 14:44, Marcello Perathoner wrote: > There is a new experimental online reader available. Start from any > bibliographic record page, eg. > > http://www.gutenberg.net/etext/4300 > > > Basically this paginates the txt file and remembers your last > position in a cookie so you can later resume reading where you left > off. > > Please test it. It should work with any book that has a text file > where the encoding is known. From marcello at perathoner.de Wed Sep 22 07:55:37 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Sep 22 07:56:02 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <200409221650.04191.meredydd@everybuddy.com> References: <415181C1.6060507@perathoner.de> <200409221650.04191.meredydd@everybuddy.com> Message-ID: <41519269.9000300@perathoner.de> Meredydd wrote: > Incidentally, I notice that the prefatory materials, in particular, > page-break neatly on the *** START OF THE PROJECT GUTENBERG ETEXT... > line. Is this pure coincidence, or deliberate? If deliberate, is it a > one-time heuristic that recognises the asterisks (or something > similar), or do you do other heuristics in an attempt to locate chapter > headings? How good do you find they are? The script just goes 50 lines down and then breaks on the first empty line. 
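For illustration only, the rule just described (go 50 lines down, then break on the first empty line) could be sketched roughly as follows in Python. This is not the reader's actual code; the function name paginate and the page_len parameter are assumptions made for the example.

```python
def paginate(text, page_len=50):
    """Split text into pages: advance page_len lines, then cut at the
    first empty line found at or after that point.

    Hypothetical sketch; not the actual Project Gutenberg reader script.
    """
    lines = text.splitlines()
    pages = []
    start = 0
    while start < len(lines):
        end = min(start + page_len, len(lines))
        # Keep going until an empty line so a paragraph is never split.
        while end < len(lines) and lines[end].strip():
            end += 1
        # Include the empty line itself in the current page.
        end = min(end + 1, len(lines))
        pages.append("\n".join(lines[start:end]))
        start = end
    return pages
```

Under a rule like this, a page break that lands next to the *** START line would be a side effect of where the blank lines fall, not a dedicated heuristic.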
-- Marcello Perathoner webmaster@gutenberg.org From hart at pglaf.org Wed Sep 22 07:56:09 2004 From: hart at pglaf.org (Michael Hart) Date: Wed Sep 22 07:56:11 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <200409221650.04191.meredydd@everybuddy.com> References: <415181C1.6060507@perathoner.de> <200409221650.04191.meredydd@everybuddy.com> Message-ID: I'm putting this in today's Newsletter. Michael On Wed, 22 Sep 2004, Meredydd wrote: > Hey, thanks a lot, Marcello, that's fantastic! > > Have you spammed the newsletter people (who's doing it nowadays?) about > this? I think it's well worth a mention... > > Incidentally, I notice that the prefatory materials, in particular, > page-break neatly on the *** START OF THE PROJECT GUTENBERG ETEXT... > line. Is this pure coincidence, or deliberate? If deliberate, is it a > one-time heuristic that recognises the asterisks (or something > similar), or do you do other heuristics in an attempt to locate chapter > headings? How good do you find they are? > > Meredydd > > On Wednesday 22 September 2004 14:44, Marcello Perathoner wrote: >> There is a new experimental online reader available. Start from any >> bibliographic record page, eg. >> >> http://www.gutenberg.net/etext/4300 >> >> >> Basically this paginates the txt file and remembers your last >> position in a cookie so you can later resume reading where you left >> off. >> >> Please test it. It should work with any book that has a text file >> where the encoding is known. > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From joel at oneporpoise.com Wed Sep 22 08:05:14 2004 From: joel at oneporpoise.com (Joel A. 
Erickson) Date: Wed Sep 22 08:09:14 2004 Subject: [gutvol-d] New Online Reader References: <415181C1.6060507@perathoner.de> Message-ID: <001601c4a0b5$91078390$6501a8c0@JOEL> I picked "Two Years Before the Mast" from the Top 100, and got nothing but blank pages. Joel > There is a new experimental online reader available. Start from any > bibliographic record page, eg. > > http://www.gutenberg.net/etext/4300 > > > Basically this paginates the txt file and remembers your last position in > a cookie so you can later resume reading where you left off. > > Please test it. It should work with any book that has a text file where > the encoding is known. From marcello at perathoner.de Wed Sep 22 08:20:15 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Sep 22 08:20:36 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <001601c4a0b5$91078390$6501a8c0@JOEL> References: <415181C1.6060507@perathoner.de> <001601c4a0b5$91078390$6501a8c0@JOEL> Message-ID: <4151982F.4070700@perathoner.de> Joel A. Erickson wrote: > I picked "Two Years Before the Mast" from the Top 100, and got > nothing but blank pages. Fixed. That text has no encoding information. I'll assume that all texts without encoding information are ASCII. -- Marcello Perathoner webmaster@gutenberg.org From Bowerbird at aol.com Wed Sep 22 09:25:33 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Sep 22 09:25:45 2004 Subject: [gutvol-d] speaking of readers Message-ID: speaking of readers, i've got a new version of my viewer-program ready for upload today for the first day of fall. to join the beta-test, just send an e-mail to: zml_talk@yahoogroups.com thank you for your time... -bowerbird p.s. the onset of fall is in 5 minutes, at 9:30 pacific...
From hart at pglaf.org Wed Sep 22 13:39:24 2004 From: hart at pglaf.org (Michael Hart) Date: Wed Sep 22 13:39:25 2004 Subject: [gutvol-d] Re: [BP] [gweekly] Pt2 Project Gutenberg Weekly Newsletter (fwd) Message-ID: ---------- Forwarded message ---------- Date: Wed, 22 Sep 2004 15:03:43 -0400 From: Stewart C. Russell To: Book People mailing list Subject: Re: [BP] [gweekly] Pt2 Project Gutenberg Weekly Newsletter Project Gutenberg Newsletter wrote: > > http://www.gutenberg.net/etext/4300 > > Basically this paginates the txt file and remembers your last position > in a cookie so you can later resume reading where you left off. A worthy effort, but it doesn't use standard document navigation, so I can't use the "Top / Up / First / Previous / Next / Last / Document / More" standard navigation bar that my browser gives me. Conforming documents make this feature one of the great joys of using Mozilla. Stewart [Moderator: For more about standard document navigation links in HTML, see http://www.w3.org/TR/html4/types.html#type-links . Note that Mozilla's link toolbar appears to use more types of link relationships than are specified in the HTML standard, but it does also appear to recognize the standard ones. - JMO] ----------------------------------------------------------------------------- This message was sent via the Book People mailing list. Posting address: spok+bookpeople@cs.cmu.edu Admin. & unsubscribe address: spok+bookpeople-request@cs.cmu.edu Charter: http://onlinebooks.library.upenn.edu/bplist/ From stephen.thomas at adelaide.edu.au Wed Sep 22 17:27:53 2004 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Wed Sep 22 17:28:11 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <415181C1.6060507@perathoner.de> References: <415181C1.6060507@perathoner.de> Message-ID: <41521889.6000005@adelaide.edu.au> Not to rain on your parade, but ... 
compare with my script for converting PG .txt to html on the fly: http://isis.library.adelaide.edu.au/cgi-bin/pg-html/pg/etext03/ulyss12.txt Some attempt at reformatting would be nice. Your cookie idea for remembering where you got to is a nice touch -- but this seems to be the only justification for splitting a work into 50-line segments. 50 seems completely arbitrary -- 25 would probably fit the whole page into my screen, so I wouldn't need to scroll. 200 -- or 2000 -- would save me clicking on Next Page so often. I like the My Bookmarks feature. But I'd still rather download to my Palm, which gives me all these features and lets me take it away from my desk. Steve Marcello Perathoner wrote: > There is a new experimental online reader available. Start from any > bibliographic record page, eg. > > http://www.gutenberg.net/etext/4300 > > > Basically this paginates the txt file and remembers your last position > in a cookie so you can later resume reading where you left off. > > Please test it. It should work with any book that has a text file > where the encoding is known. > > -- Stephen Thomas, Senior Systems Analyst, University of Adelaide Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA Phone: +61 8 830 35190 Fax: +61 8 830 34369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ Free books at eBooks@Adelaide, http://etext.library.adelaide.edu.au/ CRICOS Provider Number 00123M
From tb at baechler.net Wed Sep 22 23:06:06 2004 From: tb at baechler.net (Tony Baechler) Date: Wed Sep 22 23:05:58 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <415181C1.6060507@perathoner.de> Message-ID: <5.2.0.9.0.20040922230203.02811cd0@snoopy2.trkhosting.com> Hello. I looked at the online reader and find it useful but I see one thing which needs to be fixed. There needs to be a way to jump to a specific page. I had to follow the "next page" link several times to get past the standard PG header. I agree that the PG header is important for legal reasons and the public should know as much about PG as possible, but for the older ebooks this can be very long and many people won't have the patience to try to scroll through. I know that this would break usual procedure, but could the reader be set to skip the header entirely? I am thinking that all of the ebooks will be reposted eventually with a shortened PG header anyway and I wouldn't want people drawn away. It might also be nice to let the user decide a page size since it apparently works by deciding that a page is x bytes in the file and adjusting the offset accordingly. From marcello at perathoner.de Thu Sep 23 04:39:34 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 23 04:39:58 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <41521889.6000005@adelaide.edu.au> References: <415181C1.6060507@perathoner.de> <41521889.6000005@adelaide.edu.au> Message-ID: <4152B5F6.7010300@perathoner.de> Steve Thomas wrote: > Not to rain on your parade, but ... compare with my script for > converting PG .txt to html on the fly: > > http://isis.library.adelaide.edu.au/cgi-bin/pg-html/pg/etext03/ulyss12.txt > > Some attempt at reformatting would be nice. I'll take your text to show why I am opposed to purely automatic reformatting of text. Your text (in the 2nd paragraph) says: ?Introibo Ad Altare Dei.
where it should say ?Introibo ad altare Dei. Capitalization rules for English titles should not be applied to Latin text where they are completely inadequate. Also: Amoroso Ma Non Troppo. Von Der Sirenen Listigkeit Tun Die Poeten Dichten. Und Alle Schiffe Brucken. Tete-A-Tete The missing accents combined with the erroneous capitalization make the last two a really outstanding example of text corruption. It's better IMO to present an ugly but correct text than a pretty but corrupted one. This is the reason I decided against purely automatic reformatting. N.B. I'm not against automatic reformatting (into TEI) and then proofing the text again. > Your cookie idea for remembering where you got to is a nice touch -- but > this seems to be the only justification for splitting a work into > 50-line segments. 50 seems completely arbitrary -- 25 would probably fit > the whole page into my screen, so I wouldn't need to scroll. 200 -- or > 2000 -- would save me clicking on Next Page so often. That also could be stored in a preferences cookie. -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Sep 23 04:43:30 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 23 04:43:54 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <5.2.0.9.0.20040922230203.02811cd0@snoopy2.trkhosting.com> References: <5.2.0.9.0.20040922230203.02811cd0@snoopy2.trkhosting.com> Message-ID: <4152B6E2.3010306@perathoner.de> Tony Baechler wrote: > Hello. I looked at the online reader and find it useful but I see one > thing which needs to be fixed. There needs to be a way to jump to a > specific page. I had to follow the "next page" link several times to > get past the standard PG header. Two issues here: - do we really want people to skip (that is, ignore) the header? - The "standard header" is not standard at all. You need guessing to skip the header (something computers are not very good at). But this could be fixed.
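As a rough sketch of the guessing involved, one could look for the "*** START OF THE PROJECT GUTENBERG" marker line mentioned earlier in this thread and begin reading after it, falling back to the top of the file when no marker is found. The function name, the regular expression and the fallback are assumptions for illustration; older headers are worded differently and would need more patterns.

```python
import re

# Assumed pattern: matches lines such as
# "*** START OF THE PROJECT GUTENBERG ETEXT ... ***".
# Older files use other wordings, so this is a guess, not a full solution.
START_MARKER = re.compile(r"^\*{3}\s*START OF THE PROJECT GUTENBERG", re.IGNORECASE)


def first_body_line(lines):
    """Return the index of the first line after the start marker,
    or 0 if no marker is found (show the header rather than guess wrong)."""
    for i, line in enumerate(lines):
        if START_MARKER.match(line):
            return i + 1
    return 0
```

Returning 0 when nothing matches keeps the header visible instead of risking cutting into the book itself.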
> I agree that the PG header is > important for legal reasons and the public should know as much about PG > as possible, but for the older ebooks this can be very long and many > people won't have the patience to try to scroll through. The reposting of all texts will make this issue go away pretty soon. -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Sep 23 04:49:56 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 23 04:50:21 2004 Subject: [gutvol-d] Re: [BP] [gweekly] Pt2 Project Gutenberg Weekly Newsletter (fwd) In-Reply-To: References: Message-ID: <4152B864.4030408@perathoner.de> Michael Hart wrote: > A worthy effort, but it doesn't use standard document navigation, so I > can't use the "Top / Up / First / Previous / Next / Last / Document / > More" standard navigation bar that my browser gives me. Conforming > documents make this feature one of the great joys of using Mozilla. Nice toolbar. I didn't know I had this one :-) Navigation will come soon, in fact I've already experimented with the "preload" link but stupid mozilla doesn't preload pages that need parameters. Why it refuses to preload a page I tell him its okay to preload, I don't understand. -- Marcello Perathoner webmaster@gutenberg.org From stephen.thomas at adelaide.edu.au Thu Sep 23 05:48:58 2004 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Thu Sep 23 05:49:06 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <4152B5F6.7010300@perathoner.de> References: <415181C1.6060507@perathoner.de> <41521889.6000005@adelaide.edu.au> <4152B5F6.7010300@perathoner.de> Message-ID: <4152C63A.5060205@adelaide.edu.au> Marcello Perathoner wrote: > > I'll take your text to show why I am opposed to purely automatic > reformatting of text. > > Your text (in the 2nd paragraph) says: > > ?Introibo Ad Altare Dei. > > where it should say > > ?Introibo ad altare Dei. 
> > Capitalization rules for English titles should not be applied to Latin > text were they are completely inadequate. Also: > > Amoroso Ma Non Troppo. > > Von Der Sirenen Listigkeit > Tun Die Poeten Dichten. > Und Alle Schiffe Brucken. > > Tete-A-Tete > > The missing accents combined with the erroneous capitalization make the > last 2 examples a really outstanding example of text corruption. > > Its better IMO to present an ugly but correct text than a pretty but > corrupted one. This is the reason I decided against purely automatic > reformatting. Hmmm -- interesting point. My script reformats the PG text, which contains UND ALLE SCHIFFE BRUCKEN TETE_A_TETE etc. So, I regret to inform you that the ORIGINAL PG text is "corrupt" -- nothing to do with my script. GIGO. You have a point about capitalisation rules -- but when the original text uses ALL CAPS to represent italicised words, there's a limit to what can be done -- but we've had this discussion before, so I'll not repeat it here. Suffice to say that it is perfectly possible to apply at least some minimal formatting to improve readability, without corrupting the text. -- Stephen Thomas, Senior Systems Analyst, Adelaide University Library ADELAIDE UNIVERSITY SA 5005 AUSTRALIA Tel: +61 8 8303 5190 Fax: +61 8 8303 4369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ From marcello at perathoner.de Thu Sep 23 05:59:27 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 23 05:59:32 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <4152C63A.5060205@adelaide.edu.au> References: <415181C1.6060507@perathoner.de> <41521889.6000005@adelaide.edu.au> <4152B5F6.7010300@perathoner.de> <4152C63A.5060205@adelaide.edu.au> Message-ID: <4152C8AF.9090202@perathoner.de> Steve Thomas wrote: > Suffice to say that it is perfectly possible to apply at least some > minimal formatting to improve readability, without corrupting the text. 
But that would be very minimal: - replace the fixed-width font with a proportional one, line by line - replace leading spaces with fixed-width ones And even this would mangle some tables. -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Sep 23 06:06:54 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 23 06:07:00 2004 Subject: [gutvol-d] New Online Reader In-Reply-To: <4152C63A.5060205@adelaide.edu.au> References: <415181C1.6060507@perathoner.de> <41521889.6000005@adelaide.edu.au> <4152B5F6.7010300@perathoner.de> <4152C63A.5060205@adelaide.edu.au> Message-ID: <4152CA6E.7050004@perathoner.de> Steve Thomas wrote: > So, I regret to inform you that the ORIGINAL PG text is "corrupt" -- > nothing to do with my script. GIGO. Nope. There is a difference between claiming "I have not recorded the capitalization of this sentence", as the original PG text does, and claiming "the capitalization of this sentence is so and so", as your version does. Everybody who sees "INTROIBO AD ALTARE DEI" recognizes that some information has been lost. But a reader not familiar with Latin might not be aware that "Introibo Ad Altare Dei" is all wrong. Bottom line: if information has been lost, never use guessing to recover it but go back to the source. -- Marcello Perathoner webmaster@gutenberg.org From hart at pglaf.org Thu Sep 23 07:00:58 2004 From: hart at pglaf.org (Michael Hart) Date: Thu Sep 23 07:01:00 2004 Subject: [gutvol-d] Re: [BP] [gweekly] Pt2 Project Gutenberg Weekly Newsletter (fwd) In-Reply-To: <4152B864.4030408@perathoner.de> References: <4152B864.4030408@perathoner.de> Message-ID: Just to clarify, I did NOT write the quotation attributed to me below; it is probably an artifact of the mailer not including all of the earlier replies in the last reply.
Michael Hart On Thu, 23 Sep 2004, Marcello Perathoner wrote: > Michael Hart wrote: > >> A worthy effort, but it doesn't use standard document navigation, so I >> can't use the "Top / Up / First / Previous / Next / Last / Document / >> More" standard navigation bar that my browser gives me. Conforming >> documents make this feature one of the great joys of using Mozilla. > > Nice toolbar. I didn't know I had this one :-) > > Navigation will come soon, in fact I've already experimented with the > "preload" link but stupid mozilla doesn't preload pages that need parameters. > Why it refuses to preload a page I tell him its okay to preload, I don't > understand. > > > > -- > Marcello Perathoner > webmaster@gutenberg.org > From Bowerbird at aol.com Thu Sep 23 12:49:31 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Sep 23 12:49:46 2004 Subject: [gutvol-d] New Online Reader Message-ID: <1d6.2b54795a.2e8482cb@aol.com> marcello said: > Bottom line: if information has been lost, > never use guessing to recover it > but go back to the source. wrong. in an ideal world, with unlimited resources, sure. but in the real world of project gutenberg now, where no one seems even the least bit eager to _ever_ "go back to the source" and correct that practice of using all-caps to indicate emphasis? for _that_, a more-practical strategy is called for. if, by using guessing, the information that you "restore" is right in the vast majority of the cases (e.g., 93-98%), then you _should_ do that. you should also, of course, clearly _mark_ all of the changes that you have made, so the users know they need to be evaluated with care. then, as i have argued here quite extensively in the past, you need to put into place a robust mechanism that will encourage users to report errors, and then act on them... but you still want to go ahead and make those changes... 
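A minimal sketch of the "guess, but mark every change" idea, assuming a made-up [[guess|ORIGINAL]] marker and hypothetical function names (none of this is an existing PG tool). It lowercases runs of two or more all-caps words and keeps the original text inside the marker, so each guess can be reviewed or undone.

```python
import re

# Two or more consecutive ALL-CAPS words (each at least two letters long).
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:[ \-][A-Z]{2,})+\b")


def mark_guesses(text):
    """Replace each all-caps run with a lowercase guess, keeping the original
    inside a hypothetical [[guess|ORIGINAL]] marker so it can be reviewed."""
    return CAPS_RUN.sub(
        lambda m: "[[%s|%s]]" % (m.group(0).lower(), m.group(0)), text
    )


def undo_guesses(marked):
    """Reverse mark_guesses by restoring the original text from each marker."""
    return re.sub(r"\[\[[^|\]]*\|([^\]]*)\]\]", r"\1", marked)
```

Note that the lowercase guess for "INTROIBO AD ALTARE DEI" comes out as "introibo ad altare dei", which is wrong for "Dei"; keeping the original in the marker is what lets a reader spot and report that. The sketch also assumes the source text contains no "[[" sequences of its own.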
-bowerbird From joshua at hutchinson.net Thu Sep 23 13:11:13 2004 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Sep 23 13:11:31 2004 Subject: [gutvol-d] New Online Reader Message-ID: <20040923201113.4DA342FA02@ws6-3.us4.outblaze.com> Wrong. If you are introducing a 2-7% error rate into the texts (those are your figures) that is absolutely and completely unacceptable. You are making them WORSE than they were before. Josh ----- Original Message ----- From: Bowerbird@aol.com Date: Thu, 23 Sep 2004 15:49:31 EDT To: gutvol-d@lists.pglaf.org, Bowerbird@aol.com Subject: re: Re: [gutvol-d] New Online Reader > marcello said: > > Bottom line: if information has been lost, > > never use guessing to recover it > > but go back to the source. > > wrong. > > in an ideal world, with unlimited resources, sure. > > but in the real world of project gutenberg now, > where no one seems even the least bit eager to > _ever_ "go back to the source" and correct that > practice of using all-caps to indicate emphasis? > > for _that_, a more-practical strategy is called for. > > if, by using guessing, the information that you "restore" > is right in the vast majority of the cases (e.g., 93-98%), > then you _should_ do that. you should also, of course, > clearly _mark_ all of the changes that you have made, > so the users know they need to be evaluated with care. > > then, as i have argued here quite extensively in the past, > you need to put into place a robust mechanism that will > encourage users to report errors, and then act on them... > > but you still want to go ahead and make those changes... 
> > -bowerbird > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From Bowerbird at aol.com Thu Sep 23 14:58:58 2004 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Sep 23 14:59:13 2004 Subject: [gutvol-d] New Online Reader Message-ID: <19a.2a15377c.2e84a122@aol.com> joshua said: > If you are introducing a 2-7% error rate > into the texts (those are your figures) that is > absolutely and completely unacceptable. > You are making them WORSE than they were before. no, the texts are "in error" as they stand. those all-caps errors are no better than a no-caps error or an initial-caps error, even if you have become inured to them. and as long as the auto-changes are marked, anyone can reverse them, so there is no harm in making the "guesses". none at all. _zero._ -bowerbird From sly at victoria.tc.ca Sat Sep 25 23:46:11 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat Sep 25 23:46:32 2004 Subject: [gutvol-d] Hand Shadows to Be Thrown upon the Wall Message-ID: For those who are interested, the lastest download numbers for Hand Shadows to Be Thrown upon the Wall by Henry Bursill are 16027 (just from ibiblio) Not bad for a book that was only released two months ago. Andrew From sly at victoria.tc.ca Tue Sep 28 15:57:57 2004 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Sep 28 15:58:08 2004 Subject: [gutvol-d] PG mentioned on usenet Message-ID: I periodically search usenet newgroups to see how and where Project Gutenberg is mentioned. Here's an excerpt from a recent posting to misc.writing which shows the kind of reaction I hope many people are having when they download a PG text. Andrew Note: LW=Little Women So I didn't actually read LW until I was an adult, and forced to read it out of professional obligation. I'm still not that horribly KEEN on it, but it introduced me to other works by Alcott, and she was one heckuva lady. 
The voice she uses in her "adult" books is totally different from her kid lit. As part of my recent research I just finished "Hospital Sketches"--based on her brief career as a Civil War nurse--and it's so like a precursor to M*A*S*H, sort of this horrifying tragicomic mix of bureaucracy and death. It's a short read, available at Project Gutenberg if folks are so inclined. (Warning, though: I opened up the text file just to make sure it didn't get corrupted and wound up dropping everything to read the whole thing.) From marcello at perathoner.de Thu Sep 30 01:43:03 2004 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Sep 30 01:43:26 2004 Subject: [gutvol-d] [Fwd: Project Gutenberg Featured on October Ibiblio page] Message-ID: <415BC717.3040205@perathoner.de> -------- Original Message -------- Date: Wed, 29 Sep 2004 16:11:21 -0400 (EDT) From: Sean Slovney To: webmaster@gutenberg.org Subject: Project Gutenberg Featured on October Ibiblio page Ibiblio is pleased to announce that Project Gutenberg has been selected to be featured on the main page of ibiblio.org for the month of October. We wanted to notify you ahead of time in case you need to make changes in preparation for the potential increase in traffic on your site, and to congratulate you on your contributions to ibiblio. If you need anything from us or if being featured on the main page is a problem, please let us know before the end of September. Thanks. -- Sean Slovney epyon@ibiblio.org (919) 962-5646 -- Marcello Perathoner webmaster@gutenberg.org