From jon at noring.name Wed Aug 1 16:18:54 2007
From: jon at noring.name (Jon Noring)
Date: Wed, 1 Aug 2007 17:18:54 -0600
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
Message-ID: <116143487.20070801171854@noring.name>

An excellent blog article by Bill McCoy at Adobe:

http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html

Definitely of interest to the Project Gutenberg and Distributed
Proofreaders folk.

Jon Noring

From Bowerbird at aol.com Wed Aug 1 18:35:59 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Wed, 1 Aug 2007 21:35:59 EDT
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

well, the #7 book in the harry potter series was
translated into chinese by volunteers...

that probably doesn't surprise you. didn't surprise me.

the best complete version (there were lots,
but most were incomplete) took just 2 days.

that probably doesn't surprise you either. didn't surprise me.

what _did_ surprise me, a little, is that
the effort was coordinated by a person
who is just a high-school freshman...

the team of translators was composed of
high-school and university students.

kids these days...

> http://www.poynter.org/column.asp?id=31&aid=127529
> http://www.zonaeuropa.com/20070728_1.htm

-bowerbird

**************************************
Get a sneak peek of the all-new AOL at
http://discover.aol.com/memed/aolcom30tour
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070801/3255e8ca/attachment.htm

From mkengel at gmail.com Wed Aug 1 19:27:18 2007
From: mkengel at gmail.com (Michael Engel)
Date: Thu, 2 Aug 2007 11:27:18 +0900 (JST)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID: <1034739.23579.XAMJXAlJE1M=.1186021638.squirrel@webmailer.hosteurope.de>

> what _did_ surprise me, a little, is that
> the effort was coordinated by a person
> who is just a high-school freshman...

Let's call it organized crime or an organization of thieves.

I guess the publisher of Harry Potter will have a word with them.

From Bowerbird at aol.com Wed Aug 1 20:54:35 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Wed, 1 Aug 2007 23:54:35 EDT
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

mkengel said:
> Let's call it organized crime or an organization of thieves.

actually, since it wasn't done for profit, it was legal in china.
and given the way that country is slurping up the u.s. dollars,
they're going to be calling _all_ the shots before we know it,
so you might want to start getting used to their perspective...

plus, a poll conducted by the translators found that people
who downloaded this translation overwhelmingly said they
would buy the "official" translation when it became available.

so in actuality, these are hard-core _fans_, and not "thieves".

and you do know that old bromide about how the customer
is always right...

-bowerbird

From mkengel at gmail.com Wed Aug 1 22:40:12 2007
From: mkengel at gmail.com (Michael Engel)
Date: Thu, 2 Aug 2007 14:40:12 +0900 (JST)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID: <1034739.20361.XAMJXAlJE1M=.1186033212.squirrel@webmailer.hosteurope.de>

> actually, since it wasn't done for profit, it was legal in china.

Actually, you are not citing it correctly. On the website, they wrote
"they think they didn't do illegal things". That is very different.
The translation might be legal, but making it available for download
is not.

> plus, a poll conducted by the translators found that people
> who downloaded this translation overwhelmingly said they
> would buy the "official" translation when it became available.

The several hundred thousand downloaders will all buy the book when it
is available. Come on, do you really believe this? I have a part of the
moon to sell, then. For you, I will make a special price.

> so in actuality, these are hard-core _fans_, and not "thieves".
> and you do know that old bromide about how the customer
> is always right...

This doesn't fit here, as they don't pay the publisher or the author.

From Bowerbird at aol.com Wed Aug 1 23:23:03 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 2 Aug 2007 02:23:03 EDT
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

mkengel said:
> The translation might be legal, but making it
> available for download is not.

i didn't read it that way. but i'm not a lawyer,
not even here in the states, let alone in china,
so there's little call for me to speculate on it...

> The several hundred thousand downloaders
> will all buy the book when it is available. Come on,
> do you really believe this?

given that each book in this series has sold millions
upon millions, there's little reason for me to doubt it.

i think all the people who downloaded this translation
know full well that it was done by amateurs, and thus
might indeed be interested in one from professionals.

and i bet if the professional one had been released now,
instead of 3 months from now, the interest in this one
would have been less intense, more related to curiosity.
but if you want the story -- and _lots_ of readers did --
then you're going to jump at anything reasonably close.
heck, some hard-core people were jumping at fan fiction.

> This doesn't fit here, as they don't
> pay the publisher or the author.

i believe a good number of 'em will indeed pay them.
and the ones that won't? they wouldn't have anyway,
meaning that no sales to paying customers were lost.

unlike some people, i'm not upset that some readers
may have gotten the story "for free", because i see
people walking out of the library with "free books"
all the time. i myself have read lots of books i never
paid a nickel for. and they made me a better person.

but all this is beside my original point, which is that
the kids are creating their future with their behavior,
and all us old fogeys can do is sit around and watch...

-bowerbird

From prosfilaes at gmail.com Thu Aug 2 00:38:17 2007
From: prosfilaes at gmail.com (David Starner)
Date: Thu, 2 Aug 2007 02:38:17 -0500
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
In-Reply-To: <116143487.20070801171854@noring.name>
References: <116143487.20070801171854@noring.name>
Message-ID: <6d99d1fd0708020038ne468fbdqeb043d7cdea92fa4@mail.gmail.com>

On 8/1/07, Jon Noring wrote:
> An excellent blog article by Bill McCoy at Adobe:
>
> http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html
>
> Definitely of interest to the Project Gutenberg and Distributed
> Proofreaders folk.

I don't see why it's of much interest to us; we already remove the
watermarks and use the scans indiscriminately for our own purposes.

I find it a little overblown to complain about something that's very
clearly framed as a request.

From schultzk at uni-trier.de Thu Aug 2 01:23:02 2007
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Thu, 2 Aug 2007 10:23:02 +0200
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID:

Hi,

Translations are a funny thing as far as copyright is concerned.
In the past there have been official and unofficial translations
published commercially. I cannot cite any, but I have seen them.

Organized crime, no; loopholes, yes.

As an academic I am free to translate and publish literary texts.
Contemporary texts by rising authors, especially, are often not
available. Of course I would not publish a complete work, only the
parts that I find suitable for my work.

If these kids are not taking a profit, there is not much RK is going
to get: at most an injunction to pull it off the website. I doubt very
much that the Chinese judicial system will pursue these kids. If they
do, what are they going to do with them?
RK would be best advised to check how good their translation is and
use it if it is halfway decent.

I think it was a cute idea. It would seem to be a good model for
other projects. Heh, DP, get in contact with these kids. They could
boost your output!! I am not kidding.

regards
Keith.

On 02.08.2007 at 05:54, Bowerbird at aol.com wrote:

> mkengel said:
> > Let's call it organized crime or an organization of thieves.
>
> actually, since it wasn't done for profit, it was legal in china.
> and given the way that country is slurping up the u.s. dollars,
> they're going to be calling _all_ the shots before we know it,
> so you might want to start getting used to their perspective...
>
> plus, a poll conducted by the translators found that people
> who downloaded this translation overwhelmingly said they
> would buy the "official" translation when it became available.
>
> so in actuality, these are hard-core _fans_, and not "thieves".
>
> and you do know that old bromide about how the customer
> is always right...
>
> -bowerbird

From hart at pglaf.org Thu Aug 2 04:38:34 2007
From: hart at pglaf.org (Michael Hart)
Date: Thu, 2 Aug 2007 04:38:34 -0700 (PDT)
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
In-Reply-To: <6d99d1fd0708020038ne468fbdqeb043d7cdea92fa4@mail.gmail.com>
References: <116143487.20070801171854@noring.name> <6d99d1fd0708020038ne468fbdqeb043d7cdea92fa4@mail.gmail.com>
Message-ID:

On Thu, 2 Aug 2007, David Starner wrote:

> On 8/1/07, Jon Noring wrote:
>> An excellent blog article by Bill McCoy at Adobe:
>>
>> http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html
>>
>> Definitely of interest to the Project Gutenberg and Distributed
>> Proofreaders folk.
>
> I don't see why it's of much interest to us; we already remove the
> watermarks and use the scans indiscriminately for our own purposes.
>
> I find it a little overblown to complain about something that's very
> clearly framed as a request.

Well, if you look at the U of VA, not only do they claim
a copyright on most or all of their public domain books,
but they also claim to be the oldest and the largest
eBook library in the world.

Just one example. . . .

mh

From hart at pglaf.org Thu Aug 2 04:45:43 2007
From: hart at pglaf.org (Michael Hart)
Date: Thu, 2 Aug 2007 04:45:43 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID:

If you do your homework you will find a number of studies
all pointing out that placing free eBooks online creates
an increased retail market demand for those books.

Every year another study or two, I have yet to see one
come up on the other side of the coin.

mh

On Thu, 2 Aug 2007, Bowerbird at aol.com wrote:

> mkengel said:
>> The translation might be legal, but making it
>> available for download is not.
>
> i didn't read it that way. but i'm not a lawyer,
> not even here in the states, let alone in china,
> so there's little call for me to speculate on it...
>
>
>> The several hundred thousand downloaders
>> will all buy the book when it is available. Come on,
>> do you really believe this?
>
> given that each book in this series has sold millions
> upon millions, there's little reason for me to doubt it.
>
> i think all the people who downloaded this translation
> know full well that it was done by amateurs, and thus
> might indeed be interested in one from professionals.
>
> and i bet if the professional one had been released now,
> instead of 3 months from now, the interest in this one
> would have been less intense, more related to curiosity.
> but if you want the story -- and _lots_ of readers did --
> then you're going to jump at anything reasonably close.
> heck, some hard-core people were jumping at fan fiction.
>
>
>> This doesn't fit here, as they don't
>> pay the publisher or the author.
>
> i believe a good number of 'em will indeed pay them.
> and the ones that won't? they wouldn't have anyway,
> meaning that no sales to paying customers were lost.
>
> unlike some people, i'm not upset that some readers
> may have gotten the story "for free", because i see
> people walking out of the library with "free books"
> all the time. i myself have read lots of books i never
> paid a nickel for. and they made me a better person.
>
> but all this is beside my original point, which is that
> the kids are creating their future with their behavior,
> and all us old fogeys can do is sit around and watch...
>
> -bowerbird

From Bowerbird at aol.com Thu Aug 2 08:09:22 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 2 Aug 2007 11:09:22 EDT
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
Message-ID:

> http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html

you'd think bill mccoy would have better things to do than this silliness.

-bowerbird

From hart at pglaf.org Thu Aug 2 09:30:57 2007
From: hart at pglaf.org (Michael Hart)
Date: Thu, 2 Aug 2007 09:30:57 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

Any idea how to contact Xiao Wang???

Or the other QQ Chinese translators?

Michael

From Bowerbird at aol.com Thu Aug 2 09:30:54 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 2 Aug 2007 12:30:54 EDT
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

michael said:
> If you do your homework you will find a number of studies
> all pointing out that placing free eBooks online creates
> an increased retail market demand for those books.

right.

and in the future, the _only_ way to get noticed _at_all_
-- if you want to make any hard-copy sales -- will be
to put a free copy online. because you'll be competing
against others who've put up a free copy of their book
-- many of them with little intention of making sales --
and thus will be soaking up all the available attention...

oh hey wait, if we consider _blogs_ to be "serial books",
that's _already_ happening, isn't it? a lot of the "casual"
recreation-type reading is now channeled toward blogs.
the wide variety is a welcome change-of-pace from the
bestseller-clones the publishing industry foists on us...

-bowerbird

From hart at pglaf.org Thu Aug 2 09:38:33 2007
From: hart at pglaf.org (Michael Hart)
Date: Thu, 2 Aug 2007 09:38:33 -0700 (PDT)
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
In-Reply-To:
References:
Message-ID:

On Thu, 2 Aug 2007, Bowerbird at aol.com wrote:

>> http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html
>
> you'd think bill mccoy would have better things to do than this silliness.
>
> -bowerbird
>

My guess is that he gets a bonus for it!

mh

From ajhaines at shaw.ca Thu Aug 2 10:11:47 2007
From: ajhaines at shaw.ca (Al Haines (shaw))
Date: Thu, 02 Aug 2007 10:11:47 -0700
Subject: [gutvol-d] Welsh title page translation
Message-ID: <000c01c7d528$35c612d0$6401a8c0@ahainesp2400>

I have a small book of Welsh poetry. It's dated 1913 and is not in PG.
The book uses the standard English alphabet, except for a few
circumflexed vowels.

According to the PG Copyright How-To, an English-language translation
of the title page should accompany the TP&V scans for copyright
clearance.

How good/complete a translation should this be? Is it necessary to
translate only the basic title/author/editor info, or should the whole
page be translated? Would a translation pieced together from an on-line
Welsh-English dictionary (e.g. http://www.geiriadur.net/) be sufficient?
(I'm aware there are a number of on-line Welsh/English
dictionaries/translators, but I haven't looked into them thoroughly.
Recommendations are welcome.)

If a full/non-amateur translation is needed, is there someone out there
that can translate this title page for me if I send them a scan of it?

Regards,
Al

From shabam.dp at gmail.com Thu Aug 2 10:53:24 2007
From: shabam.dp at gmail.com (Jason Isbell (shabam))
Date: Thu, 2 Aug 2007 10:53:24 -0700
Subject: [gutvol-d] Welsh title page translation
In-Reply-To: <000c01c7d528$35c612d0$6401a8c0@ahainesp2400>
References: <000c01c7d528$35c612d0$6401a8c0@ahainesp2400>
Message-ID: <1b68e26b0708021053h779ad5f3o53abf6a4fb2e0fe3@mail.gmail.com>

Basically, it just needs to be enough that the copyright team can
understand what they need to know about the book. The ones I have done
have been neither complete nor "good", just enough that the copyright
team can confidently agree with me that the book is indeed in the
public domain.

Jason

On 8/2/07, Al Haines (shaw) wrote:
>
> I have a small book of Welsh poetry. It's dated 1913 and is not in PG.
> The book uses the standard English alphabet, except for a few
> circumflexed vowels.
>
> According to the PG Copyright How-To, an English-language translation
> of the title page should accompany the TP&V scans for copyright
> clearance.
>
> How good/complete a translation should this be? Is it necessary to
> translate only the basic title/author/editor info, or should the whole
> page be translated? Would a translation pieced together from an on-line
> Welsh-English dictionary (e.g. http://www.geiriadur.net/) be sufficient?
> (I'm aware there are a number of on-line Welsh/English
> dictionaries/translators, but I haven't looked into them thoroughly.
> Recommendations are welcome.)
>
> If a full/non-amateur translation is needed, is there someone out there
> that can translate this title page for me if I send them a scan of it?
>
> Regards,
> Al
>
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d at lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

From mkengel at gmail.com Thu Aug 2 14:47:08 2007
From: mkengel at gmail.com (Michael Engel)
Date: Fri, 3 Aug 2007 06:47:08 +0900
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID:

> If you do your homework you will find a number of studies
> all pointing out that placing free eBooks online creates
> an increased retail market demand for those books.
>
> Every year another study or two, I have yet to see one
> come up on the other side of the coin.

Is this the freedom of authors to choose how they want to get their
books translated? According to your thinking, authors don't have any
rights.

Great!

From Bowerbird at aol.com Thu Aug 2 15:58:45 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Thu, 2 Aug 2007 18:58:45 EDT
Subject: [gutvol-d] harry #7 translated to chinese
Message-ID:

mkengel said:
> Is this the freedom of authors to choose
> how they want to get their books translated?
> According to your thinking, authors don't have any rights.
> Great!

"love is a stranger in an open car...
take you in and drive you far away..."
--eurythmics

-bowerbird

From joshua at hutchinson.net Thu Aug 2 17:04:43 2007
From: joshua at hutchinson.net (joshua at hutchinson.net)
Date: Fri, 3 Aug 2007 00:04:43 +0000 (UTC)
Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
Message-ID: <17631327.1186099483045.JavaMail.?@fh1063.dia.cp.net>

Now if the blog had complained about U of Va ... then he'd have a valid
point. He really doesn't with Google since anyone with a brain and the
ability to read can see that Google is merely making a request, not a
legally binding copyright claim. (And indeed, a request that I've seen
in plenty of open source programs, too.)

Josh

>----Original Message----
>From: hart at pglaf.org
>Date: Aug 2, 2007 7:38
>To: "Project Gutenberg Volunteer Discussion"
>Subj: Re: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent)
>
>
>On Thu, 2 Aug 2007, David Starner wrote:
>
>> On 8/1/07, Jon Noring wrote:
>>> An excellent blog article by Bill McCoy at Adobe:
>>>
>>> http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html
>>>
>>> Definitely of interest to the Project Gutenberg and Distributed
>>> Proofreaders folk.
>>
>> I don't see why it's of much interest to us; we already remove the
>> watermarks and use the scans indiscriminately for our own purposes.
>>
>> I find it a little overblown to complain about something that's very
>> clearly framed as a request.
>
>Well, if you look at the U of VA, not only do they claim
>a copyright on most or all of their public domain books,
>but they also claim to be the oldest and the largest
>eBook library in the world.
>
>Just one example. . . .
>
>
>mh

From lee at novomail.net Thu Aug 2 17:00:00 2007
From: lee at novomail.net (Lee Passey)
Date: Thu, 02 Aug 2007 18:00:00 -0600
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID: <46B27000.1060908@novomail.net>

Michael Engel wrote:

>> If you do your homework you will find a number of studies
>> all pointing out that placing free eBooks online creates
>> an increased retail market demand for those books.
>>
>> Every year another study or two, I have yet to see one
>> come up on the other side of the coin.
>
>
> Is this the freedom of authors to choose how they want to get their
> books translated? According to your thinking, authors don't have any
> rights.
>
> Great!

We hold these truths to be self-evident: that all authors are endowed by
the Governments of various jurisdictions with a certain set of
Privileges to promote the common welfare. That whenever any Privilege so
endowed becomes destructive of these ends, it is the Power and Authority
of said Governments to alter or to abolish these Privileges, or to grant
new Privileges, laying their foundation on such principles as to them
shall seem most likely to effect the common good of the Populace.

--
Nothing of significance below this line.

From lee at novomail.net Thu Aug 2 18:11:20 2007
From: lee at novomail.net (Lee Passey)
Date: Thu, 02 Aug 2007 19:11:20 -0600
Subject: [gutvol-d] Unwrap lines utility?
In-Reply-To: <000001c7bfd8$6f4c48e0$1f12fea9@sarek>
References: <000001c7bfd8$6f4c48e0$1f12fea9@sarek>
Message-ID: <46B280B8.7040406@novomail.net>

John Hagerson wrote:

> I am corresponding with someone who would like to be able to unwrap the
> paragraphs from some of the older, plain text, material in our collection. I
> provided him the naive, three search-and-replace solution, but he says that
> his attempt to implement it on his computer with the file he has chosen
> causes his word processor to lock up.
>
> He is running Microsoft Windows XP. Has anyone already written a utility to
> do this? If so, please send me a pointer to it.
>
> Thank you very much.

http://sno2.iwarp.com/ebook-faq/documents/textify.html

--
Nothing of significance below this line.

From mkengel at dinj.de Thu Aug 2 23:50:36 2007
From: mkengel at dinj.de (Michael Engel)
Date: Fri, 3 Aug 2007 15:50:36 +0900 (JST)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de>

> If you do your homework you will find a number of studies
> all pointing out that placing free eBooks online creates
> an increased retail market demand for those books.

But a decision to make it available online is the right of the author -
not yours - and not the right of these students.

Project Gutenberg takes so much care about not violating the copyright of
authors but suddenly people on the list find it "cute" that some Chinese
students violate it so clearly.

Do you have double standards?
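
On the "Unwrap lines utility?" question above: the thread does not spell out the "naive, three search-and-replace solution", so what follows is only a minimal sketch of one way to unwrap hard-wrapped paragraphs, written in Python. It assumes paragraphs are separated by one or more blank lines and that every line break inside a paragraph is a soft wrap, which will mangle verse, tables, and indented blocks. The function name unwrap_paragraphs is hypothetical, and this is not the textify utility linked in the reply.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join the lines of each blank-line-separated paragraph into one line.

    Assumption: a blank (or whitespace-only) line marks a paragraph break;
    every other newline is treated as a soft wrap and replaced by a space.
    """
    # Normalise Windows and old Mac line endings to plain "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Split on runs of blank lines; "\s*" also swallows whitespace-only lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())

    # Rejoin each paragraph's lines with single spaces.
    # Note: this discards leading indentation on every line.
    unwrapped = [
        " ".join(line.strip() for line in paragraph.split("\n"))
        for paragraph in paragraphs
    ]

    # Keep one blank line between paragraphs.
    return "\n\n".join(unwrapped)


if __name__ == "__main__":
    sample = "It was the best of times,\nit was the worst of times.\n\nA second\nparagraph.\n"
    print(unwrap_paragraphs(sample))
```

Because the text is processed in memory by a script rather than through repeated search-and-replace passes in a word processor, a sketch like this may avoid the lock-up described in the question, though that has not been checked against the file in question.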

From prosfilaes at gmail.com Fri Aug 3 05:44:38 2007
From: prosfilaes at gmail.com (David Starner)
Date: Fri, 3 Aug 2007 07:44:38 -0500
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de>
References: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de>
Message-ID: <6d99d1fd0708030544v6eefd1e4lb6bdd46f2c161765@mail.gmail.com>

On 8/3/07, Michael Engel wrote:
> Project Gutenberg takes so much care about not violating the copyright of
> authors but suddenly people on the list find it "cute" that some Chinese
> students violate it so clearly.
>
> Do you have double standards?

In the first place, you don't have to like the law to find it advisable
to follow it. Project Gutenberg doesn't follow the law because the
people that make up PG like the law as written, but because if PG
ignored the law, its mission of providing a large collection of books
to a large group of people in a permanently available library would be
compromised. I believe that PG filed an amicus brief on the court case
challenging the Mickey Mouse Copyright Extension Act, indicating that
it in fact does not agree with the current copyright laws.

Secondly, the people on this list are not PG. They are, at best,
members of PG's board, and most are just volunteers. You don't have to
share all the goals and values of an organization to volunteer some
time for them; you just have to believe that working with the
organization will do good. There are Boy Scouts, for example, who don't
agree with the organization's views on homosexuality and on religion.
That doesn't mean they have double standards; that just means that they
consider the work they do with the Boy Scouts valuable despite it.

From hart at pglaf.org Fri Aug 3 08:00:42 2007
From: hart at pglaf.org (Michael Hart)
Date: Fri, 3 Aug 2007 08:00:42 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de>
References: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de>
Message-ID:

On Fri, 3 Aug 2007, Michael Engel wrote:

>> If you do your homework you will find a number of studies
>> all pointing out that placing free eBooks online creates
>> an increased retail market demand for those books.
>
> But a decision to make it available online is the right of the author -
> not yours - and not the right of these students.
>
> Project Gutenberg takes so much care about not violating the copyright of
> authors but suddenly people on the list find it "cute" that some Chinese
> students violate it so clearly.
>
> Do you have double standards?

Just pointing out the facts.

mh

From hart at pglaf.org Fri Aug 3 08:06:21 2007
From: hart at pglaf.org (Michael Hart)
Date: Fri, 3 Aug 2007 08:06:21 -0700 (PDT)
Subject: [gutvol-d] Unwrap lines utility?
In-Reply-To: <46B280B8.7040406@novomail.net>
References: <000001c7bfd8$6f4c48e0$1f12fea9@sarek> <46B280B8.7040406@novomail.net>
Message-ID:

We listed a few not that long ago, I remember one called:

"clippy"

but there are probably hundreds, if not thousands.

On Thu, 2 Aug 2007, Lee Passey wrote:

> John Hagerson wrote:
>> I am corresponding with someone who would like to be able to unwrap the
>> paragraphs from some of the older, plain text, material in our collection. I
>> provided him the naive, three search-and-replace solution, but he says that
>> his attempt to implement it on his computer with the file he has chosen
>> causes his word processor to lock up.
>>
>> He is running Microsoft Windows XP. Has anyone already written a utility to
>> do this? If so, please send me a pointer to it.
>>
>> Thank you very much.
>
> http://sno2.iwarp.com/ebook-faq/documents/textify.html
>
> --
> Nothing of significance below this line.

From hart at pglaf.org Fri Aug 3 08:09:12 2007
From: hart at pglaf.org (Michael Hart)
Date: Fri, 3 Aug 2007 08:09:12 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To: <46B27000.1060908@novomail.net>
References: <46B27000.1060908@novomail.net>
Message-ID:

On Thu, 2 Aug 2007, Lee Passey wrote:

> Michael Engel wrote:
>
>>> If you do your homework you will find a number of studies
>>> all pointing out that placing free eBooks online creates
>>> an increased retail market demand for those books.
>>>
>>> Every year another study or two, I have yet to see one
>>> come up on the other side of the coin.
>>
>>
>> Is this the freedom of authors to choose how they want to get their
>> books translated? According to your thinking, authors don't have any
>> rights.
>>
>> Great!
>

Don't try to add words to my emails, they don't line up.

I am just stating a fact, not advocating YOUR action
for doing something illegal. . .who do you think set
up this copyright research program, anyway?

mh

> We hold these truths to be self-evident: that all authors are endowed by
> the Governments of various jurisdictions with a certain set of
> Privileges to promote the common welfare. That whenever any Privilege so
> endowed becomes destructive of these ends, it is the Power and Authority
> of said Governments to alter or to abolish these Privileges, or to grant
> new Privileges, laying their foundation on such principles as to them
> shall seem most likely to effect the common good of the Populace.
>
> --
> Nothing of significance below this line.

From hart at pglaf.org Fri Aug 3 08:14:07 2007
From: hart at pglaf.org (Michael Hart)
Date: Fri, 3 Aug 2007 08:14:07 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID:

> mkengel said:
>> Is this the freedom of authors to choose
>> how they want to get their books translated?
>> According to your thinking, authors don't have any rights.
>> Great!

Actually, it's the public that doesn't have any rights.

Copyright, after 300 years of SAYING it's temporary,
limited, etc., finally admits it can be permanent...
according to the US Supreme Court copyright can have
as many extensions for as long as they want.

Apparently there was no other side to the copyright
coinage in the name of the public domain.

Somehow I can't see the Founding Fathers doing this.

mh

From hart at pglaf.org Fri Aug 3 08:29:49 2007
From: hart at pglaf.org (Michael Hart)
Date: Fri, 3 Aug 2007 08:29:49 -0700 (PDT)
Subject: [gutvol-d] harry #7 translated to chinese
In-Reply-To:
References:
Message-ID:

I notice no one replied to this one, or mentioned the
current lawsuits for misuse of copyright. . .hehee!

On Thu, 2 Aug 2007, Bowerbird at aol.com wrote:

> michael said:
>> If you do your homework you will find a number of studies
>> all pointing out that placing free eBooks online creates
>> an increased retail market demand for those books.
>
> right.
>
> and in the future, the _only_ way to get noticed _at_all_
> -- if you want to make any hard-copy sales -- will be
> to put a free copy online. because you'll be competing
> against others who've put up a free copy of their book
> -- many of them with little intention of making sales --
> and thus will be soaking up all the available attention...
>
> oh hey wait, if we consider _blogs_ to be "serial books",
> that's _already_ happening, isn't it?
a lot of the "casual" > recreation-type reading is now channeled toward blogs. > the wide variety is a welcome change-of-pace from the > bestseller-clones the publishing industry foists on us... > > -bowerbird > From marcello at perathoner.de Fri Aug 3 10:49:13 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri, 03 Aug 2007 19:49:13 +0200 Subject: [gutvol-d] harry #7 translated to chinese In-Reply-To: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de> References: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de> Message-ID: <46B36A99.5020709@perathoner.de> Michael Engel wrote: >> If you do your homework you will find a number of studies >> all pointing out that placing free eBooks online creates >> an increased retail market demand for those books. > > But a decision to make it online available is the right of the author - > not yours - and not the right of these students. > > Project Gutenberg takes so much care about not violating the copyright of > authors but suddenly people on the list find it "cute" that some Chinese > students violate it so clearly. > > Do you have double standards ? According to Marx, the most productive class will necessarily take over. There's no doubt in my mind as to which class is the more productive: On one side we have a big capitalistic publishing corporation and it takes them months to translate a book and to distribute it on a resource-wasting dead tree substrate. On the other side we have a bunch of loosely connected young people and it takes them 2 days to translate a book and to distribute it on a sleek machine-readable substrate. The whole copyright question will become moot in a few decades: - Free software is already better than proprietary software. - Free music is already better than proprietary music.
- Free contents will soon be better than proprietary contents. Just think of the billions it takes M$ or the RIAA to convince people that their crap is better than the free alternatives. Authors didn't relinquish their rights to publishing companies gladly. They were forced to, because publishing companies were the only ones that had the know-how of distribution. The internet is a far better medium for distribution than a publishing company. A publisher is somebody who separates the wheat from the chaff and prints the chaff, said Mark Twain. The internet is where authors will print the wheat. Intellectual property has no place in an age where contents can be replicated for free. See: dodo. -- Marcello Perathoner webmaster at gutenberg.org From ricardofdiogo at gmail.com Fri Aug 3 11:49:17 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Fri, 3 Aug 2007 19:49:17 +0100 Subject: [gutvol-d] harry #7 translated to chinese In-Reply-To: <46B36A99.5020709@perathoner.de> References: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de> <46B36A99.5020709@perathoner.de> Message-ID: <9c6138c50708031149t6b26782ew722628228dce342f@mail.gmail.com> 2007/8/3, Marcello Perathoner : > Authors didn't relinquish their rights to publishing companies gladly. > They were forced to, because publishing companies were the only ones > that had the know-how of distribution. The internet is a far better > medium for distribution than a publishing company. A publisher is > somebody who separates the wheat from the chaff and prints the chaff, > said Mark Twain. The internet is where authors will print the wheat. > Totally agree. However authors will always ask someone to take care of certain aspects of their work, like advertising it. > > Intellectual property has no place in an age where contents can be > replicated for free. See: dodo. > The paradigm of copyright and author's rights _has to_ and _will_ change. 
The current paradigm is that the compensation for the author's effort is supported by the end-consumer, who contributes to the royalties when buying the product. Before the Internet you could already xerox a book or copy a CD borrowed from a friend. But now you can copy millions of products and disseminate them in seconds to millions of people. No cyberpolice in the world can deal with it. Basically we are watching a massive pillage. Now there are two options: either you try to fight this pillage with already existing paradigms (cyberpolice, shutting down web servers, DRMs etc etc) _or_ you change the entire system. That is the great challenge for the next decade. In the future, products have to be delivered for free while still compensating the author. One possibility: adding advertisements to the books (already done). Problem: advertisement saturation. Another option, who knows, is a big corporation (like Sony, MS, whatever) buying a book and making it officially available for free at their website. The author would get their money, the public would get their contents, the corporations would get more visitors. Imagine that the only way you can get Stephen King's latest hit is by going to Sony's webpage... wouldn't fans go there? would they still use eMule? Why bother? Now imagine an official database linking all authors by corporation. Wouldn't it be profitable? I'm just thinking out loud here. Bottom line is: someone has to pay the author, but not the consumer. That concept is dead. From Bowerbird at aol.com Fri Aug 3 12:48:00 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 3 Aug 2007 15:48:00 EDT Subject: [gutvol-d] harry #7 translated to chinese Message-ID: mkengel said: > Project Gutenberg takes so much care about > not violating the copyright of authors > but suddenly people on the list find it "cute" > that some Chinese students violate it so clearly. your certainty here is _badly_ misplaced, because nobody "violated" any copyright.
not legally, and not even "ethically", since -- to the people on my side of the fence, which includes all of the creative people that i know -- translations are _a_gift_... -bowerbird From j.hagerson at comcast.net Fri Aug 3 17:46:26 2007 From: j.hagerson at comcast.net (John Hagerson) Date: Fri, 3 Aug 2007 19:46:26 -0500 Subject: [gutvol-d] FW: harry #7 translated to chinese Message-ID: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> > -----Original Message----- > From: gutvol-d-bounces at lists.pglaf.org [mailto:gutvol-d- > bounces at lists.pglaf.org] On Behalf Of Ricardo F Diogo > Sent: Friday, August 03, 2007 1:49 PM > To: Project Gutenberg Volunteer Discussion > Subject: Re: [gutvol-d] harry #7 translated to chinese > > 2007/8/3, Marcello Perathoner : > > Authors didn't relinquish their rights to publishing companies gladly. > > They were forced to, because publishing companies were the only ones > > that had the know-how of distribution. The internet is a far better > > medium for distribution than a publishing company. A publisher is > > somebody who separates the wheat from the chaff and prints the chaff, > > said Mark Twain. The internet is where authors will print the wheat. > > > Totally agree. However authors will always ask someone to take care of > certain aspects of their work, like advertising it. > > > > Intellectual property has no place in an age where contents can be > > replicated for free. See: dodo. > > > The paradigm of copyright and author's rights _has to_ and _will_ change. > The current paradygm is that the compensation for the author's effort > is supported by the end-consumer, who contributes to the royalties > when buying the product.
> Before Internet you could already xerox a book or copy a CD borrowed > from a friend. But now you can copy millions of products and > disseminate them in seconds to millions of people. No cyberpolice in > the world can deal with it. Basically we are watching a massive > pillage. > Now there are two options: whether you try to fight this pillage with > already existing paradygms (cyberpolice, shutting down web servers, > DRMs etc etc) _or_ you change the entire system. That is the great > challenge for the next decade. > In the future, products have to be delivered for free while still > compensating the author. One possibility: adding advertisements to the > books (already done). Problem: advertisement saturation. > Another option, who knows, is a big corporation (like Sony, MS, > whatever) buying a book and making it officially available for free at > their website. The author would get their money, the public would get > their contents, the corporations would get more visitors. > Imagine that the only way you can get Stephan Kind's last hit is by > going to Sony's webpage... wouldn't fans go there? would they still > use Emule? Why bothering? Now imagine an official database linking all > authors by corporation. Wouldn't it be profitable? > I'm just thinking out loud here. Bottom line is: someone has to pay to > the author but not the consumer. That concept is dead. And what benefit would the theoretical corporation receive for buying Stephen King's next "best seller" and giving it away? I put quotation marks around "best seller" because only one copy of the book is actually sold in this model (where a sale is defined as a voluntary transaction between a willing buyer and a willing seller). Why would Sony or Microsoft invest in the considerable bandwidth and server capacity required to do this? The shareholders of these two companies would not be very happy at the waste of their corporate resources. 
This sounds like the BBC model of TV where the owner of every television in Britain pays a government tax and Channel 1 and Channel 2 are "free." The industry needs to continue a model where willing buyers pay willing sellers. Whether that is through advertising, donations, or a large, well-funded foundation that compensates authors and builds its reputation by distributing literature for "free," we can't devolve into the government content provision model. From ricardofdiogo at gmail.com Fri Aug 3 20:53:45 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Sat, 4 Aug 2007 04:53:45 +0100 Subject: [gutvol-d] FW: harry #7 translated to chinese In-Reply-To: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> References: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> Message-ID: <9c6138c50708032053h59f20ee9qaf4e2ab9024ff093@mail.gmail.com> 2007/8/4, John Hagerson : > > Why would Sony or Microsoft invest in > the considerable bandwidth and server capacity required to do this? Why does Google? (And Microsoft is already doing it too by the way.) > The industry needs to continue a model where willing buyers pay willing > sellers. If that is through advertising, donations, or a large, well funded > foundation that compensates authors and builds its reputation by > distributing literature for "free," we can't devolve into the government > content provision model. > Willing buyers? I don't see the number of willing buyers rising much these days. _I am absolutely sure_ that final consumers will not want to pay for what they can get for free, even if it's illegal. Two options: either authors/publishers find a way of using the "free" concept in their favour or they can cry all they want, because no government in the world has the ability to face the problem (even if they all created a cybercop for each user or prohibited all sorts of copies). What authors do today is claim that the Internet is decreasing the number of sales...
they don't seem to understand the basics: if sales are decreasing, it is because people don't want to _buy_ their books!! And they ask for anti-piracy policies so that they can keep their traditional model while they should be studying a new model where _piracy just isn't needed_. Your foundation idea seems nice... Although PGLAF doesn't have to care about paying royalties, I'm pretty sure that lots of people would donate money to us so that we could pay Stephen King for his works and release them for free. And again, if the names of the donors were public I'm sure we'd have lots of banks and all sorts of corporations donating. From marcello at perathoner.de Fri Aug 3 21:18:56 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Sat, 04 Aug 2007 06:18:56 +0200 Subject: [gutvol-d] FW: harry #7 translated to chinese In-Reply-To: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> References: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> Message-ID: <46B3FE30.70505@perathoner.de> John Hagerson wrote: > The industry needs to continue a model where willing buyers pay willing > sellers. My argument was that we don't need "industry" any more, because we have technically evolved beyond the need of "industry". We can and do build better software, make better music and write better books without "industry". And it's only a question of time when we will start making better chips and better pharmaceuticals ... The "bazaar" model of production using the internet makes people more productive than the "cathedral" model using corporate resources. -- Marcello Perathoner webmaster at gutenberg.org From Bowerbird at aol.com Fri Aug 3 22:20:54 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Sat, 4 Aug 2007 01:20:54 EDT Subject: [gutvol-d] FW: harry #7 translated to chinese Message-ID: john said: > we can't devolve into the government content provision model. sure we could. but nope, we probably shouldn't.
and we definitely don't need to, so probably won't. as i said earlier, the tendency will be for authors to put their work online, available to readers for free... the return cycle on this exchange-of-gifts is when readers then reward the authors of books we like... not that this will be limited to books... indeed, _most_ forms of content will eventually be distributed this way. we can already start to see this happening with bands offering their music for free on places like myspace... with collaborative filtering doing the important job of separating the wheat from the chaff -- at the level of each of our individual tastes -- there is no real need to have the content-corporations to be middle-men... -bowerbird From hart at pglaf.org Sun Aug 5 08:42:21 2007 From: hart at pglaf.org (Michael Hart) Date: Sun, 5 Aug 2007 08:42:21 -0700 (PDT) Subject: [gutvol-d] FW: harry #7 translated to chinese In-Reply-To: <46B3FE30.70505@perathoner.de> References: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> <46B3FE30.70505@perathoner.de> Message-ID: On Sat, 4 Aug 2007, Marcello Perathoner wrote: > John Hagerson wrote: > >> The industry needs to continue a model where willing >> buyers pay willing sellers. > > My argument was that we don't need "industry" any more, > because we have technically evolved beyond the need of > "industry". We can and do build better software, make > better music and write better books without "industry". > And it's only a question of time when we will start > making better chips and better pharmaceuticals ... > > The the "bazaar" model of production using the internet > makes people more productive than the "cathedral" model > using corporate resources.
> > > -- Marcello Perathoner webmaster at gutenberg.org I agree with Marcello that the "industry" as a concept is passe in the same sense that The Stationers' Guilds were made passe by The Gutenberg Press. But don't forget that even though it took them a whole quarter of a millennium, The Stationers quite outlawed the use of The Gutenberg Press by all others. . . . The first copyright act. You can be sure WIPO and "The Industry" have cards at least as horrendous, if not more so, up their sleeves, not for a moment forgetting The US Atty General's request that copyright violation get life imprisonment. . . . As if any white collar crime EVER got that. . . . WHAT a precedent. . .JUST to mention it. . . . mh From hart at pglaf.org Sun Aug 5 08:54:52 2007 From: hart at pglaf.org (Michael Hart) Date: Sun, 5 Aug 2007 08:54:52 -0700 (PDT) Subject: [gutvol-d] FW: harry #7 translated to chinese In-Reply-To: <9c6138c50708032053h59f20ee9qaf4e2ab9024ff093@mail.gmail.com> References: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> <9c6138c50708032053h59f20ee9qaf4e2ab9024ff093@mail.gmail.com> Message-ID: The comments below obviously do not take the facts into much account, as the vast majority still buy, even when something is available free on the Net. Everyone _I_ know is buying Harry Potter 7, and is in fact in the process of buying all the rest, now that the series is complete. All the studies [do your homework] show that quite literally only about 1-2% of such sales are lost. Certainly not the 10-15% claimed by "the industry" in their rants and raves. mh On Sat, 4 Aug 2007, Ricardo F Diogo wrote: > 2007/8/4, John Hagerson : >> >> Why would Sony or Microsoft invest in >> the considerable bandwidth and server capacity required to do this? > > Why does Google? (And Microsoft is already doing it too by the way.) > > > The industry needs to continue a model where willing buyers pay willing >> sellers.
If that is through advertising, donations, or a large, well funded >> foundation that compensates authors and builds its reputation by >> distributing literature for "free," we can't devolve into the government >> content provision model. >> > > Willing buyers? I don't see the number of willing buyers raising too > much these days. _I am absolutely sure_ that final consumers will not > want to pay for what they can get for free, even if it's illegal. > Two options: or authors/publishers find a way of using the "free" > concept in their favour or they can cry whatever they want because no > government in the world has the ability to face the problem (even if > they all created a cybercop for each user or prohibited all sorts of > copies). > What authors do today is claiming that the Internet is decreasing the > number of sellings... they don't seem to understand the basics: if > sellings are decreasing is because people don't want to _buy_ their > books!! And they ask for anti-piracy policies so that they can keep > their traditional model while they should be studying a new model > where _piracy isn't just needed_. > Your foundation idea seems nice... Although PGLAF doesn't have to care > about paying royalties, I'm pretty sure that lots of people would > donate us money so that we could pay Stephen King for his works and > release them for free. And again, if the names of the donators were > public I'm sure we'd have lots of banks and all sorts of corporations > donating. 
From hart at pglaf.org Sun Aug 5 09:03:30 2007 From: hart at pglaf.org (Michael Hart) Date: Sun, 5 Aug 2007 09:03:30 -0700 (PDT) Subject: [gutvol-d] FW: harry #7 translated to chinese In-Reply-To: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> References: <00ef01c7d630$e36aa5a0$1f12fea9@sarek> Message-ID: On Fri, 3 Aug 2007, John Hagerson wrote: > > And what benefit would the theoretical corporation > receive for buying Stephen King's next "best seller" and > giving it away? I put quotation marks around "best > seller" because only one copy of the book is actually > sold in this model (where a sale is defined as a > voluntary transaction between a willing buyer and a > willing seller). The only "voluntary" parties to the issuance of longer and longer copyrights are "The Industry." The hearings for the 1998 "Mickey Mouse" Copyright Law, at which I was planning to testify, turned out to be held behind closed doors without public notice, and I only found out about them weeks after the fact. Not only that, but the VOTE was also behind closed doors, and a "voice vote" only, so there would be no records, no way to know who sold the public's domain way adown, upon the Potomac River. > Why would Sony or Microsoft invest in the considerable > bandwidth and server capacity required to do this? The > shareholders of these two companies would not be very > happy at the waste of their corporate resources. For the same reason Apple built iTunes. . .eh? > This sounds like the BBC model of TV where the owner of > every television in Britain pays a government tax and > Channel 1 and Channel 2 are "free." 99 cents isn't quite "free" and no taxes are involved. > The industry needs to continue a model where willing > buyers pay willing sellers.
If that is through > advertising, donations, or a large, well funded > foundation that compensates authors and builds its > reputation by distributing literature for "free," we > can't devolve into the government content provision > model. The public domain is SUPPOSED to be free. That's why THEY are killing it off. . . . See Eldred v Ashcroft, a case that started 5-4 and went down at 7-2. . .no chance for the public domain!!! mh > > _______________________________________________ gutvol-d > mailing list gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From richfield at telkomsa.net Sun Aug 5 09:22:36 2007 From: richfield at telkomsa.net (Jon Richfield) Date: Sun, 05 Aug 2007 18:22:36 +0200 Subject: [gutvol-d] Once more unto the list, dear friends... Message-ID: <46B5F94C.7050507@telkomsa.net> Sometime I'll start to get things right, and then my glory shall excoriate... er... excelsiorate... excalibate... celibate... (no! Surely not that!) cerebrate... celebrate... coruscate... Hmmm... getting warmer! Well anyway, something along those lines. Meanwhile I am slowly working away at reducing the number of untried blunders that remain to me. Not that I never repeat a blunder of course, but only when I can't remember it. I recall one time (hmm... actually it was the second time, I can't remember the first, but it was the expression on my neighbour's face that embossed the event on my memory, and convinced me that it must indeed be the second time because she could never have portrayed such bewildered disbelief the first time round.) Anyway, I was only trying to be helpful! Meanwhile, still on my drunkard's walk to functionality, I have the impression that somewhere I saw that one of the acceptable formats for submission is RTF. (Submission to PG, that is!) So: 1 Am I right? 2 If so, are there any strings attached to the use of RTF? Word is not acceptable, it seems, so why RTF? 
Is RTF just acceptable as a measure of desperation in dealing with those beyond the pale, who still need to be warned off the baits of the Evil Empire? If so, why is Open Office not acceptable? 3 I had considered RTF, but although it is not bad for plain text, its graphics are amazingly bloated. That was why I tried to use HTML instead. David kindly pointed me at a package (pg2html) that apparently produces acceptable HTML, but I am still experimenting. I also still am awaiting connection to broadband, after which on-line checking should become more practical, but for the present, repeated uploading of multi-MB files is a bit of a pain, so I am passing my time in other ways. Cheers, and thanks for the copious and helpful and, all things considered, angelically patient responses that I have learned to expect. Jon From ricardofdiogo at gmail.com Sun Aug 5 11:25:04 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Sun, 5 Aug 2007 19:25:04 +0100 Subject: [gutvol-d] Once more unto the list, dear friends... In-Reply-To: <46B5F94C.7050507@telkomsa.net> References: <46B5F94C.7050507@telkomsa.net> Message-ID: <9c6138c50708051125h5397a07dw1d86a968b42513b1@mail.gmail.com> 2007/8/5, Jon Richfield : > I have the impression that somewhere I saw that one of the acceptable > formats for submission is RTF. (Submission to PG, that is!) > As far as you also send a plain txt, yes it is. > 2 If so, are there any strings attached to the use of RTF? Word is not > acceptable, it seems, so why RTF? Is RTF just acceptable as a measure > of desperation in dealing with those beyond the pale, who still need to > be warned off the baits of the Evil Empire? If so, why is Open Office > not acceptable? I would try keeping the layout as universal as I can. Times New Roman, size 12, justified, bold for chapter titles. You can also send in Open Office files. 
(This raises a good question though: although Open Office is open source, most people will probably prefer RTF, which is proprietary but can be read by almost all word processors. What a paradox!) From schultzk at uni-trier.de Sun Aug 5 23:59:26 2007 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Mon, 6 Aug 2007 08:59:26 +0200 Subject: [gutvol-d] harry #7 translated to chinese In-Reply-To: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de> References: <1034739.27375.XAMJXAlJE1M=.1186123836.squirrel@webmailer.hosteurope.de> Message-ID: Hi All, On 03.08.2007 at 08:50, Michael Engel wrote: >> If you do your homework you will find a number of studies >> all pointing out that placing free eBooks online creates >> an increased retail market demand for those books. > > But a decision to make it online available is the right of the > author - > not yours - and not the right of these students. Not quite right. It was their translation, their work. Yes, it was based on someone else's work, yet it is their work and their copyright!! Let's look at Shakespeare, definitely out of copyright (if that is not enough, use Plato, Homer, etc.). A published book containing his dramas is copyrighted, but only the form, that is, the book itself, not the works contained. I have mentioned this some time ago. Now if I scan this book, published in, say, 2005 (or 2000, or 1980), and put its contents online, I am effectively violating copyright law! If I use a book out of copyright I am not, even though the content is the same. If I translate the dramas held within into German I am not violating copyright!!! > > Project Gutenberg takes so much care about not violating the > copyright of > authors but suddenly people on the list find it "cute" that some > Chinese > students violate it so clearly. You do have to admit the idea is cute. They did not want to wait months for an official translation. So they got it done faster.
If they are students (academics) they can translate and publish inside of the academic license. Though I do admit they should get permission or at least tell RK about it. I doubt very much that they were intending to do harm. > > Do you have double standards ? > What standards! Just using all the loopholes in all the laws!!! What about their rights, their copyright to their translation!! It was their intellectual act that did the translation! It is you who has the double standard. regards Keith. From ricardofdiogo at gmail.com Tue Aug 7 10:19:16 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Tue, 7 Aug 2007 18:19:16 +0100 Subject: [gutvol-d] IDPF epub Message-ID: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> Does anyone know if there's already an open source *.epub reader around or in preparation? Ricardo From desrod at gnu-designs.com Tue Aug 7 11:21:20 2007 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Tue, 07 Aug 2007 14:21:20 -0400 Subject: [gutvol-d] IDPF epub In-Reply-To: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> References: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> Message-ID: <1186510880.17659.113.camel@localhost.localdomain> On Tue, 2007-08-07 at 18:19 +0100, Ricardo F Diogo wrote: > Does anyone know if there's already an open source *.epub reader > around or in preparation? This might help: http://www.jedisaber.com/eBooks/tutorial.asp They claim "Adobe Digital Editions" can read it. http://www.adobe.com/products/digitaleditions/ This seems somewhat relevant also: http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=36819 -- David A. Desrosiers desrod at gnu-designs.com setuid at gmail.com http://projects.plkr.org/ Skype...: 860-967-3820 -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070807/75d2e5cb/attachment.pgp From ricardofdiogo at gmail.com Tue Aug 7 11:28:04 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Tue, 7 Aug 2007 19:28:04 +0100 Subject: [gutvol-d] IDPF epub In-Reply-To: <1186510880.17659.113.camel@localhost.localdomain> References: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> <1186510880.17659.113.camel@localhost.localdomain> Message-ID: <9c6138c50708071128t555f0487nf3f807fa7d75a6a9@mail.gmail.com> 2007/8/7, David A. Desrosiers : > On Tue, 2007-08-07 at 18:19 +0100, Ricardo F Diogo wrote: > > Does anyone know if there's already an open source *.epub reader > > around or in preparation? > > This might help: > > http://www.jedisaber.com/eBooks/tutorial.asp > > They claim "Adobe Digital Editions" can read it. > > http://www.adobe.com/products/digitaleditions/ > > This seems somewhat relevant also: > > http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=36819 > I've already been in both and I'm reading some *.epub files using Digital Editions. But since I can't find *.epub files anywhere else and can't download the source from Adobe I can't actually realise how good would that format be for us here at PG. From jon at noring.name Tue Aug 7 12:26:52 2007 From: jon at noring.name (Jon Noring) Date: Tue, 7 Aug 2007 13:26:52 -0600 Subject: [gutvol-d] IDPF epub In-Reply-To: <1186510880.17659.113.camel@localhost.localdomain> References: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> <1186510880.17659.113.camel@localhost.localdomain> Message-ID: <1164583476.20070807132652@noring.name> David Desrosiers wrote: > Ricardo F Diogo wrote: >> Does anyone know if there's already an open source *.epub reader >> around or in preparation? > This might help: It is important to understand *what* EPUB is. 
EPUB is nothing more than an IDPF OPS/OEBPS Publication wrapped into the IDPF OCF Container (essentially a classic zip file.) Thus the proper question to ask is what reading systems out there will properly render an OPS/OEBPS Publication. Currently, there is one released OPS/OEBPS Reading System: Adobe "Digital Editions". Obviously this is not an open source reading system. In addition, OSoft is progressing (so I understand) on their OPS/OEBPS Reading System plugin for dotReader. Their reading system is open source, and in fact welcomes help from developers in speeding up the process. Contact Gary Varnell at OSoft if you are interested in helping out. gvarnell *at* osoft.com . Jon Noring (who, btw, is one of the core IDPF OPS Working Group contributors) From jon at noring.name Tue Aug 7 12:31:29 2007 From: jon at noring.name (Jon Noring) Date: Tue, 7 Aug 2007 13:31:29 -0600 Subject: [gutvol-d] IDPF epub In-Reply-To: <9c6138c50708071128t555f0487nf3f807fa7d75a6a9@mail.gmail.com> References: <9c6138c50708071019s33b0154aid37da04b39baa8ea@mail.gmail.com> <1186510880.17659.113.camel@localhost.localdomain> <9c6138c50708071128t555f0487nf3f807fa7d75a6a9@mail.gmail.com> Message-ID: <1586377490.20070807133129@noring.name> Ricardo wrote: > I've already been in both and I'm reading some *.epub files using > Digital Editions. But since I can't find *.epub files anywhere else > and can't download the source from Adobe I can't actually realise how > good would that format be for us here at PG. As noted in my prior reply, EPUB is an *open standard* ebook format. And as also noted, the question is not to produce EPUB, but to produce an OPS (formerly OEBPS) Publication, an open standard soon to be released by IDPF. OPS is, by and large, XHTML 1.1 based. So, put your content into XHTML 1.1, and the jump to OPS is quite small. Then to produce EPUB, simply zip up the file set following the requirements of the IDPF OCF spec. EPUB is NOT an Adobe proprietary format... 
Btw, I'll be happy to advise anyone here on producing EPUBs. Jon Noring From brad at chenla.org Tue Aug 7 14:00:15 2007 From: brad at chenla.org (Brad Collins) Date: Tue, 07 Aug 2007 14:00:15 -0700 Subject: [gutvol-d] Bill McCoy (at Adobe) comments on Google, copyright and the Public Domain (excellent) In-Reply-To: <116143487.20070801171854@noring.name> (Jon Noring's message of "Wed, 1 Aug 2007 17:18:54 -0600") References: <116143487.20070801171854@noring.name> Message-ID: Jon Noring writes: > An excellent blog article by Bill McCoy at Adobe: > > http://blogs.adobe.com/billmccoy/2007/08/google_a_glassh.html > > Definitely of interest to the Project Gutenberg and Distributed > Proofreaders folk. > > Jon Noring I've also noticed watermarks on page scans at the Internet Archive (at least for U of Toronto scans) with the IA logo and a copyright notice saying that no one is allowed to index the file for commercial purposes. This is obviously targeted at Google Books, trying to exclude them from indexing pages. Most of these watermarked scans are very clearly in the public domain, and those that aren't are not works that Microsoft owns, so this is an odd thing for them to claim. The Internet Archive are supposed to be the good guys. It's disappointing to see them making bogus copyright claims on works in the public domain. b/ -- Brad Collins , Bankwao, Thailand From ricardofdiogo at gmail.com Wed Aug 8 08:15:53 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 8 Aug 2007 16:15:53 +0100 Subject: [gutvol-d] Catalog Message-ID: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> Why don't we display in the catalog pages at least the publisher name and edition date and place below the titles of the ebooks?
Ricardo From gbnewby at pglaf.org Wed Aug 8 08:24:23 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Wed, 8 Aug 2007 08:24:23 -0700 Subject: [gutvol-d] Catalog In-Reply-To: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> Message-ID: <20070808152423.GA8971@mail.pglaf.org> On Wed, Aug 08, 2007 at 04:15:53PM +0100, Ricardo F Diogo wrote: > Why don't we display in the catalog pages at least the publisher name > and edition date and place below the titles of the ebooks? Project Gutenberg is the publisher for all of our stuff, and the edition is part of the metadata. We don't try to encode within-the-book content like publisher etc. as part of our metadata. We don't enforce any adherence to a particular printed edition, so I don't think print sources should be part of our main metadata. It might be interesting to include it in some sort of subsidiary database, though. Currently, we don't collect that info in a very robust way (year, city & location are part of the copyright clearance submission request, but not the publisher name). -- Greg From jon at noring.name Wed Aug 8 08:35:03 2007 From: jon at noring.name (Jon Noring) Date: Wed, 8 Aug 2007 09:35:03 -0600 Subject: [gutvol-d] Catalog In-Reply-To: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> Message-ID: <1136876243.20070808093503@noring.name> Ricardo wrote: > Why don't we display in the catalog pages at least the publisher name > and edition date and place below the titles of the ebooks? The bibliographic source information for most of the early PG texts (which happen to represent the most popular works in the Public Domain) is missing. (see note below) DP preserves this information, so at least for the DP portion of the PG collection, and for all other new PG texts, this information can be made available to users.
And, yes, I support PG adding bibliographic source information to each text catalog entry when that information is known. Jon Noring (Note: the reason for the lack of bibliographic source information for the earlier PG texts has been discussed before. For my take on the matter, which I've been called to task for not explaining "why", refer to a TeleRead blog article I wrote not too long ago: http://www.teleread.org/blog/?p=6174 . Michael Hart says the source information for the early PG texts was stripped out on the advice of an attorney, ca. 1995 or thereabouts. I also have my opinion on how stupid the attorney's advice was, but that's for another time. Of course, I expect a certain someone to jump on this message just to rile the water some. LOL.) From ricardofdiogo at gmail.com Wed Aug 8 08:58:04 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 8 Aug 2007 16:58:04 +0100 Subject: [gutvol-d] Catalog In-Reply-To: <1136876243.20070808093503@noring.name> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <1136876243.20070808093503@noring.name> Message-ID: <9c6138c50708080858r496dce5bpcc954a41da5db89@mail.gmail.com> 2007/8/8, Jon Noring : > > DP preserves this information, so at least for the DP portion of the > PG collection, and for all other new PG texts, this information can be > made available to users. > > And, yes, I support PG adding bibliographic source information to each > text catalog entry when that information is known. > Yes, that was what I was going to suggest. Although PG may be a publisher, we reproduce some editions fully (both using raw page images and DP-produced ebooks). So in these cases that info should also be part of the catalog, I guess.
Ricardo From jon at noring.name Wed Aug 8 09:22:14 2007 From: jon at noring.name (Jon Noring) Date: Wed, 8 Aug 2007 10:22:14 -0600 Subject: [gutvol-d] Catalog In-Reply-To: <9c6138c50708080858r496dce5bpcc954a41da5db89@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <1136876243.20070808093503@noring.name> <9c6138c50708080858r496dce5bpcc954a41da5db89@mail.gmail.com> Message-ID: <165451507.20070808102214@noring.name> Ricardo wrote: > Jon Noring: >> DP preserves this information, so at least for the DP portion of the >> PG collection, and for all other new PG texts, this information can be >> made available to users. >> >> And, yes, I support PG adding bibliographic source information to each >> text catalog entry when that information is known. > Yes, that was what I was going to suggest. > Although PG may be a publisher, we reproduce some editions fully (both > using raw page images and DP-produced ebooks). So in these cases that > info should also be part of the catalog, I guess. Well, I take exception to the view that PG is a publisher. I prefer to think of PG as a "republisher" since the Public Domain is a "finished" product and PG simply takes what's already done and recasts it for a new market. PG is not developing new writers, nor paying advances, nor doing any marketing of individual works for authors, nor paying royalties, etc., etc. So, if the provenance of a particular PG text is known, then I think PG's catalog should state that provenance. Why not? No good reason has been given for holding back known information that many find valuable. If the information is not known, then it is not known and the catalog source information is simply left blank. Is there a good and defensible reason for PG to hide from the public the provenance of a PG text when that provenance is fully known and fully characterizable (e.g., all DP texts fall in this category)?
I've not yet heard a good reason, and PG being a "publisher" is a non-reason. (And please, don't start bringing up other "publishers" not doing the same. I heard that a million times with my children coming home from school: "Dad, all the kids at school are doing it!" I think PG should think of itself as *better* than other republishers of the Public Domain. I simply don't understand the resistance to the idea of including provenance info in the catalog entry for PG texts when that information is known.) IMHO of course. Jon Noring From ajhaines at shaw.ca Wed Aug 8 10:06:35 2007 From: ajhaines at shaw.ca (Al Haines (shaw)) Date: Wed, 08 Aug 2007 10:06:35 -0700 Subject: [gutvol-d] Publisher info Message-ID: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> In Greg's message in the "Catalog" thread (Aug 8, 2007), he says: > It might be interesting to include in some sort of subsidiary database, > though. Currently, we don't collect that info in a very robust way > (year, city & location are part of the copyright clearance submission > request, but not the publisher name). In fact, there is a field on the clearance submission form for the publisher name. That name is also shown on the various "status" pages on http://copy.pglaf.org. Al From jon at noring.name Wed Aug 8 10:27:49 2007 From: jon at noring.name (Jon Noring) Date: Wed, 8 Aug 2007 11:27:49 -0600 Subject: [gutvol-d] Publisher info In-Reply-To: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> Message-ID: <1898809197.20070808112749@noring.name> Al wrote: > Greg Newby wrote: >> It might be interesting to include in some sort of subsidiary database, >> though. Currently, we don't collect that info in a very robust way >> (year, city & location are part of the copyright clearance submission >> request, but not the publisher name). > In fact, there is a field on the clearance submission form for the publisher > name. 
That name is also shown on the various "status" pages on > http://copy.pglaf.org. What I think would also be good to include, when available, would be the scan for the source book title and copyright page(s). This has the added advantage of supporting the copyright clearance process in case anyone ever challenges any particular etext as to whether the source was truly Public Domain. So long as the source book is truly Public Domain, having more information rather than less actually helps with defending against copyright infringement, and in some cases may stop anyone from even considering it. And the public would love to see the original title page scans! (Some title pages are truly works of art.) Jon Noring From marcello at perathoner.de Wed Aug 8 10:30:01 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed, 08 Aug 2007 19:30:01 +0200 Subject: [gutvol-d] Catalog In-Reply-To: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> Message-ID: <46B9FD99.5090400@perathoner.de> Ricardo F Diogo wrote: > Why don't we display in the catalog pages at least the publisher name > and edition date and place bellow the titles of the ebooks? Because nobody took the trouble of entering them. The catalog software already can do this and more. See: http://www.gutenberg.org/etext/16243 -- Marcello Perathoner webmaster at gutenberg.org From marcello at perathoner.de Wed Aug 8 10:36:39 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed, 08 Aug 2007 19:36:39 +0200 Subject: [gutvol-d] Publisher info In-Reply-To: <1898809197.20070808112749@noring.name> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> Message-ID: <46B9FF27.4080900@perathoner.de> Jon Noring wrote: > What I think would also be good to include, when available, would be the > scan for the source book title and copyright page(s). Also can do this. 
See: http://www.gutenberg.org/etext/8789 -- Marcello Perathoner webmaster at gutenberg.org From jon at noring.name Wed Aug 8 10:42:56 2007 From: jon at noring.name (Jon Noring) Date: Wed, 8 Aug 2007 11:42:56 -0600 Subject: [gutvol-d] Publisher info In-Reply-To: <46B9FF27.4080900@perathoner.de> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> Message-ID: <1869879678.20070808114256@noring.name> Marcello wrote: > Jon Noring wrote: >> What I think would also be good to include, when available, would be the >> scan for the source book title and copyright page(s). > Also can do this. See: > > http://www.gutenberg.org/etext/8789 Cooler than cool! One idea to consider for the future: provide a thumbnail of the original title page scan so when that is clicked it brings up the larger size image. Jon Noring From shabam.dp at gmail.com Wed Aug 8 11:14:41 2007 From: shabam.dp at gmail.com (Jason Isbell (shabam)) Date: Wed, 8 Aug 2007 11:14:41 -0700 Subject: [gutvol-d] Catalog In-Reply-To: <46B9FD99.5090400@perathoner.de> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <46B9FD99.5090400@perathoner.de> Message-ID: <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> If one was willing to help add stuff to the catalog, where the fields already exist, how would one go about doing this? Jason On 8/8/07, Marcello Perathoner wrote: > > Ricardo F Diogo wrote: > > > Why don't we display in the catalog pages at least the publisher name > > and edition date and place bellow the titles of the ebooks? > > Because nobody took the trouble of entering them. > > The catalog software already can do this and more. 
See: > > http://www.gutenberg.org/etext/16243 > > > > -- > Marcello Perathoner > webmaster at gutenberg.org From marcello at perathoner.de Wed Aug 8 11:38:55 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed, 08 Aug 2007 20:38:55 +0200 Subject: [gutvol-d] Catalog In-Reply-To: <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <46B9FD99.5090400@perathoner.de> <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> Message-ID: <46BA0DBF.8010501@perathoner.de> Jason Isbell (shabam) wrote: > If one was willing to help add stuff to the catalog, where the fields > already exist, how would one go about doing this? One would backchannel me to obtain a login/password to the catalog system and then coordinate with Andrew Sly and the other catalog people on the gutcat mailing list. -- Marcello Perathoner webmaster at gutenberg.org From ricardofdiogo at gmail.com Wed Aug 8 12:23:15 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 8 Aug 2007 20:23:15 +0100 Subject: [gutvol-d] Catalog In-Reply-To: <46BA0DBF.8010501@perathoner.de> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <46B9FD99.5090400@perathoner.de> <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> <46BA0DBF.8010501@perathoner.de> Message-ID: <9c6138c50708081223k1db95e16i919ee5ed622a4d01@mail.gmail.com> 2007/8/8, Marcello Perathoner : > One would backchannel me to obtain a login/password to the catalog > system and then coordinate with Andrew Sly and the other catalog people > on the gutcat mailing list.
> I volunteer for re-cataloguing the Portuguese etexts. Please send me the login/pass. From marcello at perathoner.de Wed Aug 8 13:03:37 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed, 08 Aug 2007 22:03:37 +0200 Subject: [gutvol-d] Catalog In-Reply-To: <9c6138c50708081223k1db95e16i919ee5ed622a4d01@mail.gmail.com> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <46B9FD99.5090400@perathoner.de> <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> <46BA0DBF.8010501@perathoner.de> <9c6138c50708081223k1db95e16i919ee5ed622a4d01@mail.gmail.com> Message-ID: <46BA2199.1070505@perathoner.de> Ricardo F Diogo wrote: > I volunteer for re-cataloguing the Portuguese etexts. Please send me > the login/pass. Sent. Before you start, make sure you coordinate with Andrew Sly: sly at victoria.tc.ca For English titles we follow the Library of Congress catalog. Don't know if that makes sense for Portuguese titles. -- Marcello Perathoner webmaster at gutenberg.org From ricardofdiogo at gmail.com Wed Aug 8 13:09:01 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 8 Aug 2007 21:09:01 +0100 Subject: [gutvol-d] Catalog In-Reply-To: <46BA2199.1070505@perathoner.de> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <46B9FD99.5090400@perathoner.de> <1b68e26b0708081114o5932d388h6838dce9598cce99@mail.gmail.com> <46BA0DBF.8010501@perathoner.de> <9c6138c50708081223k1db95e16i919ee5ed622a4d01@mail.gmail.com> <46BA2199.1070505@perathoner.de> Message-ID: <9c6138c50708081309r7f7298a9s8a38c9357d51bac8@mail.gmail.com> Sure. Thanks, Don't know. I'll discuss it in the proper mailing list. 2007/8/8, Marcello Perathoner : > Ricardo F Diogo wrote: > > > I volunteer for re-cataloguing the Portuguese etexts. Please send me > > the login/pass. > > Sent. > > Before you start, make sure you coordinate with Andrew Sly: > sly at victoria.tc.ca > > For English titles we follow the Library of Congress catalog. 
Don't know > if that makes sense for Portuguese titles. > > > > > -- > Marcello Perathoner > webmaster at gutenberg.org From jmdyck at ibiblio.org Wed Aug 8 15:03:53 2007 From: jmdyck at ibiblio.org (Michael Dyck) Date: Wed, 08 Aug 2007 15:03:53 -0700 Subject: [gutvol-d] non-adherence to a particular printed edition In-Reply-To: <20070808152423.GA8971@mail.pglaf.org> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <20070808152423.GA8971@mail.pglaf.org> Message-ID: <46BA3DC9.8000507@ibiblio.org> Greg Newby wrote: > We don't enforce any adherence to a particular printed edition, As I understand it, if someone ever claims that one of PG's texts is actually a copyrighted work, someone at PG can pull out the Title Page & Verso that supposedly prove that the work is in the public domain (in the USA). But if PG doesn't actually assert that the text adheres to the printed edition depicted in the TP+V, doesn't that weaken the legal defense? -Michael From walter.van.holst at xs4all.nl Wed Aug 8 15:05:00 2007 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Thu, 09 Aug 2007 00:05:00 +0200 Subject: [gutvol-d] Publisher info In-Reply-To: <1869879678.20070808114256@noring.name> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> Message-ID: <46BA3E0C.1020008@xs4all.nl> Jon Noring wrote: > One idea to consider for the future: provide a thumbnail of the > original title page scan so when that is clicked it brings up the > larger size image. Another suggestion, if I may dare to make one: have some room for a short description of the book. This would benefit those who try to browse the Gutenberg catalog.
Regards, Walter From gbnewby at pglaf.org Wed Aug 8 15:24:05 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Wed, 8 Aug 2007 15:24:05 -0700 Subject: [gutvol-d] non-adherence to a particular printed edition In-Reply-To: <46BA3DC9.8000507@ibiblio.org> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <20070808152423.GA8971@mail.pglaf.org> <46BA3DC9.8000507@ibiblio.org> Message-ID: <20070808222405.GA19021@mail.pglaf.org> On Wed, Aug 08, 2007 at 03:03:53PM -0700, Michael Dyck wrote: > Greg Newby wrote: > > We don't enforce any adherence to a particular printed edition, > > As I understand it, if someone ever claims that one of PG's texts is > actually a copyrighted work, someone at PG can pull out the Title Page & > Verso that supposedly prove that the work is in the public domain (in > the USA). Yes, but we don't enforce adherence, and do have a number of frankentexts that have benefitted from different sources. There is a definite trust with producers that they're not grabbing stuff from copyrighted sources for their public domain content. For example, modern imagery or a new preface. > But if PG doesn't actually assert that the text adheres to the printed > edition depicted in the TP+V, doesn't that weaken the legal defense? Theoretically, maybe... but for the most part, the different variations on a book (if variations exist) are all public domain anyway. The main exception I can think of is new modern translations, such as for Voltaire's Candide. The new translations get a new copyright, even though they are often quite similar to the old. When we did Ulysses, I found no changes at all from the original 1922 edition (published in Paris) to any modern edition. That's a counter-example, where I suspect publishers/editors were afraid to make changes to Joyce's text. More frequently, new content is either value-adding (such as a preface, index, footnotes...) or trivial and not eligible for a new copyright (spelling changes, punctuation).
Now that Marcello pointed out the features of the PG catalog (which I had forgotten about, even though Dante's Divine Comedy is a favorite!), we can look at including more metadata in the catalog. It already can appear in the eBook, if the producer chooses to put it there. But that information does not get included in the metadata (the part at the top of the eBook files) supplied by the submitter. -- Greg From desrod at gnu-designs.com Wed Aug 8 18:38:46 2007 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Wed, 08 Aug 2007 21:38:46 -0400 Subject: [gutvol-d] Keeping up with all of these ebook formats Message-ID: <1186623526.2435.71.camel@localhost.localdomain> I saw this on another list, and thought it would be appropriate here, based on all of our recent discussions about the various sundry ebook formats, readers and incompatibilities (this is benign and work-safe). http://www.youtube.com/watch?v=xFAWR6hzZek -- David A. Desrosiers desrod at gnu-designs.com setuid at gmail.com http://projects.plkr.org/ Skype...: 860-967-3820 From robert_marquardt at gmx.de Wed Aug 8 21:33:35 2007 From: robert_marquardt at gmx.de (Robert Marquardt) Date: Thu, 09 Aug 2007 06:33:35 +0200 Subject: [gutvol-d] Publisher info In-Reply-To: <46BA3E0C.1020008@xs4all.nl> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> Message-ID: On Thu, 09 Aug 2007 00:05:00 +0200, you wrote: >Another suggestion, if I may dare to make one: have some room for a >short description of the book. This would benefit those who try to >browse the Gutenberg catalog. That is a hard one unless we find a free database already containing such short descriptions already. If we find one i definitely support the idea to integrate the data. 
-- Robert Marquardt (Team JEDI) http://delphi-jedi.org From robert_marquardt at gmx.de Wed Aug 8 21:44:11 2007 From: robert_marquardt at gmx.de (Robert Marquardt) Date: Thu, 09 Aug 2007 06:44:11 +0200 Subject: [gutvol-d] Keeping up with all of these ebook formats In-Reply-To: <1186623526.2435.71.camel@localhost.localdomain> References: <1186623526.2435.71.camel@localhost.localdomain> Message-ID: On Wed, 08 Aug 2007 21:38:46 -0400, you wrote: >I saw this on another list, and thought it would be appropriate here, >based on all of our recent discussions about the various sundry ebook >formats, readers and incompatibilities (this is benign and work-safe). > >http://www.youtube.com/watch?v=xFAWR6hzZek Marcello, can we get the YouTube extension for the Wiki? We have now two Gutenberg-related videos available and i think we should present them on a bookshelf page. -- Robert Marquardt (Team JEDI) http://delphi-jedi.org From jared.buck at gmail.com Wed Aug 8 21:41:43 2007 From: jared.buck at gmail.com (Jared Buck) Date: Wed, 8 Aug 2007 21:41:43 -0700 Subject: [gutvol-d] Keeping up with all of these ebook formats In-Reply-To: References: <1186623526.2435.71.camel@localhost.localdomain> Message-ID: Sounds like a good idea to me. Jared (changed email to this one) On 8/8/07, Robert Marquardt wrote: > On Wed, 08 Aug 2007 21:38:46 -0400, you wrote: > > >I saw this on another list, and thought it would be appropriate here, > >based on all of our recent discussions about the various sundry ebook > >formats, readers and incompatibilities (this is benign and work-safe). > > > >http://www.youtube.com/watch?v=xFAWR6hzZek > > Marcello, can we get the YouTube extension for the Wiki? We have now two Gutenberg-related videos available and i think > we should present them on a bookshelf page. 
> -- > Robert Marquardt (Team JEDI) http://delphi-jedi.org > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From walter.van.holst at xs4all.nl Wed Aug 8 23:55:50 2007 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Thu, 09 Aug 2007 08:55:50 +0200 Subject: [gutvol-d] Publisher info In-Reply-To: References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> Message-ID: <46BABA76.3050508@xs4all.nl> Robert Marquardt wrote: >> Another suggestion, if I may dare to make one: have some room for a >> short description of the book. This would benefit those who try to >> browse the Gutenberg catalog. > > That is a hard one unless we find a free database already containing such short descriptions already. If we find one i > definitely support the idea to integrate the data. Build it and the volunteers to provide the data are likely to come. There are a lot of hidden gems in the Gutenberg catalog, a few weeks ago it took me almost an hour to find a book about a subject I _knew_ was available through Gutenberg, I just didn't know the title or the author. Regards, Walter From marcello at perathoner.de Thu Aug 9 05:13:55 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu, 09 Aug 2007 14:13:55 +0200 Subject: [gutvol-d] Keeping up with all of these ebook formats In-Reply-To: References: <1186623526.2435.71.camel@localhost.localdomain> Message-ID: <46BB0503.2090501@perathoner.de> Robert Marquardt wrote: > Marcello, can we get the YouTube extension for the Wiki? We have now > two Gutenberg-related videos available and i think we should present > them on a bookshelf page. Why don't you just link to it? I don't know if it is even legal to copy this video. 
-- Marcello Perathoner webmaster at gutenberg.org From shabam.dp at gmail.com Thu Aug 9 10:23:42 2007 From: shabam.dp at gmail.com (Jason Isbell (shabam)) Date: Thu, 9 Aug 2007 10:23:42 -0700 Subject: [gutvol-d] Publisher info In-Reply-To: <46BABA76.3050508@xs4all.nl> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> Message-ID: <1b68e26b0708091023h6024f1ccq7fd5a257d45123cb@mail.gmail.com> In MARC, there is "520 Summary, etc. note". We do have the ability to add these into the catalog already. I also get annoyed at not being able to find things on a subject, but I think that filling the Subject and LoC fields will be more helpful here. Of course, filling all of them is even better. The more information we have in the catalog, the better, as far as I'm concerned. Jason On 8/8/07, Walter van Holst wrote: > Robert Marquardt wrote: > >> Another suggestion, if I may dare to make one: have some room for a > >> short description of the book. This would benefit those who try to > >> browse the Gutenberg catalog. > > > > That is a hard one unless we find a free database already containing such short descriptions already. If we find one i > > definitely support the idea to integrate the data. > > Build it and the volunteers to provide the data are likely to come. > There are a lot of hidden gems in the Gutenberg catalog, a few weeks ago > it took me almost an hour to find a book about a subject I _knew_ was > available through Gutenberg, I just didn't know the title or the author. 
> > Regards, > > Walter > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From greg at durendal.org Fri Aug 10 07:13:23 2007 From: greg at durendal.org (Greg Weeks) Date: Fri, 10 Aug 2007 10:13:23 -0400 (EDT) Subject: [gutvol-d] Blackmask Message-ID: Did anyone else notice that Blackmask is back? I think it's the same person, but he's using the Munsey name now as well as the blackmask web site. -- Greg Weeks http://durendal.org:8080/greg/ From robert_marquardt at gmx.de Fri Aug 10 07:55:20 2007 From: robert_marquardt at gmx.de (Robert Marquardt) Date: Fri, 10 Aug 2007 16:55:20 +0200 Subject: [gutvol-d] Blackmask In-Reply-To: References: Message-ID: On Fri, 10 Aug 2007 10:13:23 -0400 (EDT), you wrote: > >Did anyone else notice that Blackmask is back? > >I think it's the same person, but he's using the Munsey name now as well >as the blackmask web site. I know, but i forgot to update the Blackmask references in the wiki. -- Robert Marquardt (Team JEDI) http://delphi-jedi.org From sly at victoria.tc.ca Sat Aug 11 14:45:13 2007 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat, 11 Aug 2007 14:45:13 -0700 (PDT) Subject: [gutvol-d] Catalog In-Reply-To: <20070808152423.GA8971@mail.pglaf.org> References: <9c6138c50708080815o2a6ea736n7e8d5cf3e184ef3c@mail.gmail.com> <20070808152423.GA8971@mail.pglaf.org> Message-ID: On Wed, 8 Aug 2007, Greg Newby wrote: > On Wed, Aug 08, 2007 at 04:15:53PM +0100, Ricardo F Diogo wrote: > > Why don't we display in the catalog pages at least the publisher name > > and edition date and place bellow the titles of the ebooks? > > Project Gutenberg is the publisher for all of our stuff, > and the edition is part of the metadata. > > We don't try to encode within-the-book content like publisher > etc. as part of our metadata. Yes. For one thing, there is already enough confusion in the "metadata" of what is getting posted. 
I would not suggest having volunteers try to add more, which would likely result in more examples that would have professional catalogers running away screaming. :) With that said, there are plenty of good reasons to save this type of thing if an accurate way can be found to do it. One thing about the TEI header that I have come to really appreciate is that there are two totally separate sections for description of the digital item on hand, and description of the _source_ that it was transcribed from. At first glance you see a lot of redundancy and repetition here, but there are places where this is indispensable. Anything like publisher of source, number of pages in source, date of publication of source, and on occasion even title of source[1], do not belong in the catalog as if they are describing the item at hand (the PG text). [1] For instance, PG#17649, which is a digital transcription of a 1984 edition of a 1979 reprint of a 1901 facsimile reprint of all four issues of the 1850 periodical 'The Germ'. Anyway, if you wanted a nice full title statement of the source I used, you might have something like: THE GERM : Thoughts towards Nature in Poetry, Literature and Art, BEING A FACSIMILE REPRINT OF THE LITERARY ORGAN OF THE PRE-RAPHAELITE BROTHERHOOD, PUBLISHED IN 1850; WITH AN INTRODUCTION BY WILLIAM MICHAEL ROSSETTI But this is not applicable to the PG text, because the PG text is not a "facsimile reprint".
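A rough sketch of that separation as a data structure, in Python. The class and field names are hypothetical; they are neither TEI element names nor actual PG catalog fields. The values come from the PG#17649 example above, with the short title "The Germ" used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceDescription:
    """The printed source that was transcribed (hypothetical fields)."""
    title: Optional[str] = None
    publisher: Optional[str] = None   # left as None when not known
    year: Optional[int] = None


@dataclass
class CatalogRecord:
    """The digital item itself, with its source kept in a separate section."""
    etext_number: int
    title: str
    source: Optional[SourceDescription] = None

    def describes_source(self) -> bool:
        # True only when some source information was recorded.
        return self.source is not None


record = CatalogRecord(
    etext_number=17649,
    title="The Germ",
    source=SourceDescription(
        title="THE GERM : Thoughts towards Nature in Poetry, Literature "
              "and Art, BEING A FACSIMILE REPRINT ...",
        year=1984,
    ),
)
```

The "facsimile reprint" wording lives only in `record.source.title`, so it never reads as a description of the PG text.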
Andrew From sly at victoria.tc.ca Sat Aug 11 15:12:41 2007 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat, 11 Aug 2007 15:12:41 -0700 (PDT) Subject: [gutvol-d] Publisher info In-Reply-To: <46BABA76.3050508@xs4all.nl> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> Message-ID: On Thu, 9 Aug 2007, Walter van Holst wrote: > > Build it and the volunteers to provide the data are likely to come. > There are a lot of hidden gems in the Gutenberg catalog, a few weeks ago > it took me almost an hour to find a book about a subject I _knew_ was > available through Gutenberg, I just didn't know the title or the author. Yes, these are all the same questions I asked myself years ago. These varied discussions lately bring up the question "what do volunteers really want the PG catalog to be?" Something vaguely resembling a traditional library catalog? Something more open, which contains whatever people want to put in it? I believe the PG catalog has already had its share of people tinkering a little bit here, and then a little bit there, and then losing interest. The idea I can't seem to get out of my head is that of having some kind of wiki page for each item which could handle the diverse things volunteers express interest in much better than the current setup.
Andrew From sly at victoria.tc.ca Sat Aug 11 15:15:26 2007 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat, 11 Aug 2007 15:15:26 -0700 (PDT) Subject: [gutvol-d] Publisher info In-Reply-To: <1b68e26b0708091023h6024f1ccq7fd5a257d45123cb@mail.gmail.com> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> <1b68e26b0708091023h6024f1ccq7fd5a257d45123cb@mail.gmail.com> Message-ID: The problem I have with saying that we're using MARC fields and so forth is that it somehow gives the impression that the PG catalog follows standards for library science. However, it does not. (and in its current form I don't see any way that it ever could.) Andrew On Thu, 9 Aug 2007, Jason Isbell (shabam) wrote: > In MARC, there is "520 Summary, etc. note". We do have the ability to > add these into the catalog already. > > I also get annoyed at not being able to find things on a subject, but > I think that filling the Subject and LoC fields will be more helpful > here. Of course, filling all of them is even better. The more > information we have in the catalog, the better, as far as I'm > concerned. > > Jason > > On 8/8/07, Walter van Holst wrote: > > Robert Marquardt wrote: > > >> Another suggestion, if I may dare to make one: have some room for a > > >> short description of the book. This would benefit those who try to > > >> browse the Gutenberg catalog. > > > > > > That is a hard one unless we find a free database already containing such short descriptions already. If we find one i > > > definitely support the idea to integrate the data. > > > > Build it and the volunteers to provide the data are likely to come. 
> > There are a lot of hidden gems in the Gutenberg catalog, a few weeks ago > > it took me almost an hour to find a book about a subject I _knew_ was > > available through Gutenberg, I just didn't know the title or the author. > From klofstrom at gmail.com Sat Aug 11 15:27:11 2007 From: klofstrom at gmail.com (Karen Lofstrom) Date: Sat, 11 Aug 2007 12:27:11 -1000 Subject: [gutvol-d] Publisher info In-Reply-To: References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> Message-ID: <1e8e65080708111527s436c09dn73c9d97aace3f6c9@mail.gmail.com> On 8/11/07, Andrew Sly wrote: > The idea I can't seem to get out of my head is that of > having some kind of wiki page for each item which could > handle the diverse things volunteers express interest > in much better than the current setup. Yes! Space isn't limited to a catalogue card any longer. Wiki space is computer searchable. It could also be a place for people to discuss the book, or submit articles about the book. Let's try it. However, there are some flaws in the Wikipedia model -- flaws that drove me, a long-time editor, out. Let's not repeat them. No anonymous editing. Let people put their real-life identity behind their words and their work. If there's some way of confirming that, then we can't have sock-puppetry. The whole world doesn't need to know who's behind a username, but the admins need to know. Better social controls. A quicker process for ejecting troublemakers. Are there wikis out there that work better than the parent Wikipedia, wikis whose social controls we could imitate? As I said, let's try it. It's just electrons, and if we find it doesn't work, we can shut it down. 
-- Karen Lofstrom From sly at victoria.tc.ca Sat Aug 11 15:32:44 2007 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat, 11 Aug 2007 15:32:44 -0700 (PDT) Subject: [gutvol-d] Publisher info In-Reply-To: <1e8e65080708111527s436c09dn73c9d97aace3f6c9@mail.gmail.com> References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> <1e8e65080708111527s436c09dn73c9d97aace3f6c9@mail.gmail.com> Message-ID: Well, at one point a while ago Marcello was going to implement it. However, I believe that he found a problem right away. The only unique identifier that we have for each item is its PG number. So when you want to link to something or use a "categories" feature (which is what I was most interested in at the time) what you have to work with is just plain digits, rather than an understandable title. Andrew On Sat, 11 Aug 2007, Karen Lofstrom wrote: > On 8/11/07, Andrew Sly wrote: > > > The idea I can't seem to get out of my head is that of > > having some kind of wiki page for each item which could > > handle the diverse things volunteers express interest > > in much better than the current setup. > > Yes! Space isn't limited to a catalogue card any longer. Wiki space is > computer searchable. It could also be a place for people to discuss > the book, or submit articles about the book. Let's try it. > > However, there are some flaws in the Wikipedia model -- flaws that > drove me, a long-time editor, out. Let's not repeat them. No anonymous > editing. Let people put their real-life identity behind their words > and their work. If there's some way of confirming that, then we can't > have sock-puppetry. The whole world doesn't need to know who's behind > a username, but the admins need to know. Better social controls. A > quicker process for ejecting troublemakers. 
Are there wikis out there > that work better than the parent Wikipedia, wikis whose social > controls we could imitate? > > As I said, let's try it. It's just electrons, and if we find it > doesn't work, we can shut it down. > > From klofstrom at gmail.com Sat Aug 11 15:49:49 2007 From: klofstrom at gmail.com (Karen Lofstrom) Date: Sat, 11 Aug 2007 12:49:49 -1000 Subject: [gutvol-d] Publisher info In-Reply-To: References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> <1e8e65080708111527s436c09dn73c9d97aace3f6c9@mail.gmail.com> Message-ID: <1e8e65080708111549v3b393bd1vefee4d775172eb7d@mail.gmail.com> On 8/11/07, Andrew Sly wrote: > Well, at one point a while ago Marcello was going to implement it. > However, I believe that he found a problem right away. The only > unique identifier that we have for each item is its PG number. > So when you want to link to something or use a "categories" > feature (which is what I was most interested in at the time) > what you have to work with is just plain digits, rather than > an understandable title. I'm in no way a programmer, but I have vague recollections of Unix supporting symbolic links. One file can have different names. So, can we set up a different naming system and symbolically link it to the numbers? 
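A minimal sketch of that idea, assuming a plain lookup table in Python rather than actual Unix symlinks or any real wiki feature. The names slugify and AliasTable are invented for the illustration, and 99999 is a made-up second PG number used only to show what happens when two items share a title.

```python
import re


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated page name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class AliasTable:
    """Maps readable names onto PG numbers, much as a symlink maps onto a file."""

    def __init__(self) -> None:
        self._by_alias = {}

    def add(self, pg_number: int, title: str) -> str:
        """Register a readable alias for a PG number and return the alias."""
        alias = slugify(title)
        existing = self._by_alias.get(alias)
        if existing is not None and existing != pg_number:
            # Two different items share a title: append the number to keep
            # the alias unique while still being readable.
            alias = f"{alias}-{pg_number}"
        self._by_alias[alias] = pg_number
        return alias

    def resolve(self, alias: str) -> int:
        """Follow the alias back to the PG number it points at."""
        return self._by_alias[alias]


table = AliasTable()
print(table.add(17649, "The Germ"))   # readable name for PG#17649
print(table.resolve("the-germ"))      # back to the number
```

The PG number stays the one true identifier; the readable name is just a second label pointing at it, so links and categories could use the label without the catalog changing underneath.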
-- Karen Lofstrom From shabam.dp at gmail.com Mon Aug 13 08:27:26 2007 From: shabam.dp at gmail.com (shabam) Date: Mon, 13 Aug 2007 08:27:26 -0700 Subject: [gutvol-d] Publisher info In-Reply-To: References: <000d01c7d9de$7a825f40$6401a8c0@ahainesp2400> <1898809197.20070808112749@noring.name> <46B9FF27.4080900@perathoner.de> <1869879678.20070808114256@noring.name> <46BA3E0C.1020008@xs4all.nl> <46BABA76.3050508@xs4all.nl> Message-ID: <1ac896090708130827t4aa43b52if33588d8420eeea1@mail.gmail.com> > These varied discussions lately bring up the question "what do > volunteers really want the PG catalog to be?" > > Something vaguely resembling a traditional library catalog? Yes. If this would make it possible for libraries to link into our catalog. That would provide greater readership of our books. > Something more open, which contains whatever people want to > put in it? Yes. Much of the information we have now is rather skimpy, sometimes non-existent. Some of us love the books on PG, and while we might not all want to spend a bunch of time working on improving the catalog, we might want to improve one or two books. Or maybe the ones we personally worked on. There might still be certain items that are not editable by the regular user (Title, Author), but allowing more people to edit other things (Subject tags, reviews, summary, imprint) would give us a richer catalog. > I believe the PG catalog has already had its share of people > tinkering a little bit here, and then a little bit there, > and then losing interest. Yes, but they still improved it. They work for a while, lose interest, and someone else does the same. We don't need to have one person working on it all the time. > The idea I can't seem to get out of my head is that of > having some kind of wiki page for each item which could > handle the diverse things volunteers express interest > in much better than the current setup. I like wikis. 
Jason From gbnewby at pglaf.org Sun Aug 19 19:11:02 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Sun, 19 Aug 2007 19:11:02 -0700 Subject: [gutvol-d] New search engine; feedback sought Message-ID: <20070820021102.GA31582@mail.pglaf.org> Gutenberg volunteers & other interested folks: Here's something new to try. It's a search engine that includes fielded searches, plus text, for the PG content. It enables searching the files (as Google does), and parses the catalog for database-style searches (like our Yahoo search). It's fresh. Please email Anna Tothfalusi with any feedback or questions. The site: http://bookmine.tesuji.eu/gutenberg/ The interface is sparse, but the "help" link shows syntax examples..."Advanced search" offers some drop-downs. Ideas for the interface, search functionality, etc. would be welcome. I hope to make this one of the PG search options at gutenberg.org. Thanks in advance for feedback & ideas. Please be sure to Cc: Anna Tothfalusi -- Greg From ricardofdiogo at gmail.com Tue Aug 21 06:18:20 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Tue, 21 Aug 2007 14:18:20 +0100 Subject: [gutvol-d] Errata Team Message-ID: <9c6138c50708210618j30d0b3dep17c3e59818c9b7fa@mail.gmail.com> Is there any way I can help the errata team? I've sent a report more than one month ago and still haven't got any answer. I guess I could take care of Portuguese erratas. Ricardo From gbnewby at pglaf.org Tue Aug 21 08:23:11 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Tue, 21 Aug 2007 08:23:11 -0700 Subject: [gutvol-d] Errata Team In-Reply-To: <9c6138c50708210618j30d0b3dep17c3e59818c9b7fa@mail.gmail.com> References: <9c6138c50708210618j30d0b3dep17c3e59818c9b7fa@mail.gmail.com> Message-ID: <20070821152311.GC31139@mail.pglaf.org> I sent a note to Ricardo about this. On Tue, Aug 21, 2007 at 02:18:20PM +0100, Ricardo F Diogo wrote: > Is there any way I can help the errata team? 
I've sent a report more > than one month ago and still haven't got any answer. I guess I could > take care of Portuguese erratas. > > Ricardo > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From Bowerbird at aol.com Tue Aug 21 10:03:12 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 21 Aug 2007 13:03:12 EDT Subject: [gutvol-d] Errata Team Message-ID: greg said: > I sent a note to Ricardo about this. why the lack of transparency? this is a weak point in your infrastructure, one of the worst. act and react publicly. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070821/c8a12b9b/attachment.htm From gbnewby at pglaf.org Tue Aug 21 10:47:48 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Tue, 21 Aug 2007 10:47:48 -0700 Subject: [gutvol-d] Errata Team In-Reply-To: References: Message-ID: <20070821174748.GA869@mail.pglaf.org> On Tue, Aug 21, 2007 at 01:03:12PM -0400, Bowerbird at aol.com wrote: > greg said: > > I sent a note to Ricardo about this. > > why the lack of transparency? > > this is a weak point in your infrastructure, > one of the worst. act and react publicly. > > -bowerbird ???? A guy asks to get involved, I sent him a note about what to do. You got a problem with that? If you want to know how errata are handled, read about it in the FAQ, R.26. What is it you think is not transparent? I respond to hundreds of email messages concerning PG per week. You think they should all be Cc'd to some public address, too? 
-- Greg From ricardofdiogo at gmail.com Tue Aug 21 11:47:23 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Tue, 21 Aug 2007 19:47:23 +0100 Subject: [gutvol-d] People Behind PG Message-ID: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> I'm trying to write a more complete "People Behind PG" page for the wiki. Please edit my user page in case you have more accurate info: http://www.gutenberg.org/wiki/User:Ricdiogo/behind Ricardo From Bowerbird at aol.com Tue Aug 21 12:02:36 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 21 Aug 2007 15:02:36 EDT Subject: [gutvol-d] Errata Team Message-ID: greg said: > A guy asks to get involved, > I sent him a note about what to do.? > You got a problem with that? you left out a few important details in this summary. > If you want to know how errata are handled, > read about it in the FAQ, R.26.? > What is it you think is not transparent? i know "how" the errata are "handled". i also know that i often hear from people who have submitted reports that it seems to them that the reaction-time is very slow. (that's precisely why i described it as one of the worst weak points in your entire infrastructure. people would accept flaws in your e-texts much more readily if your error-correction was faster and more public and generally more transparent, especially now that we're in the age of wiki wiki.) so it seems to me -- correct me if i'm wrong -- that you need some more help in this regard... thus, when the world hands you a chance to tell a large number of your supporters how to help, you should use that opportunity more wisely... > I respond to hundreds of email messages > concerning PG per week. You think they should > all be Cc'd to some public address, too? ok, let's review. ricardo made his request publicly. and you made a public "response" saying that you "handled" the matter by sending him a backchannel. so you made a frontchannel and a backchannel reply. 
except the frontchannel reply -- the one that hundreds of people interested enough in project gutenberg to subscribe to the mailing list -- was totally uninformative. again, why not use the opportunity to let us subscribers know how we could help in the errata effort? even if it's info listed somewhere else, why not repeat it here, now? (or at the very least provide a pointer to it? _the_full_url_.) so yes, i'm saying if an inquiry is made in public, like this, you _should_ respond to it in public. that's transparency. and heck, even for the inquiries that are made in private, you _might_ well want to make a response in public too... after all, that's what f.a.q. are for -- so you can answer a question once, publicly, instead of over and over privately. continuing to respond privately is a mistake in prioritization. because, honestly, if you're answering _hundreds_ of e-mails every _week_, i don't think you're using your time in the best way possible, and you should find a way to reduce that burden so that you can instead pay attention to more important matters -- the kind of things that newby-and-only-newby can do -- instead of wearing yourself out on the less-important stuff... this is constructive criticism, greg. be smart, and view it that way. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070821/909b893f/attachment.htm From gbnewby at pglaf.org Tue Aug 21 13:45:55 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Tue, 21 Aug 2007 13:45:55 -0700 Subject: [gutvol-d] Errata Team In-Reply-To: References: Message-ID: <20070821204555.GA3516@mail.pglaf.org> On Tue, Aug 21, 2007 at 03:02:36PM -0400, Bowerbird at aol.com wrote: > greg said: > > A guy asks to get involved, > > I sent him a note about what to do.? > > You got a problem with that? 
> > you left out a few important details in this summary. > > > > If you want to know how errata are handled, > > read about it in the FAQ, R.26.? > > What is it you think is not transparent? > > i know "how" the errata are "handled". > > i also know that i often hear from people > who have submitted reports that it seems > to them that the reaction-time is very slow. > > (that's precisely why i described it as one of the > worst weak points in your entire infrastructure. > people would accept flaws in your e-texts much > more readily if your error-correction was faster > and more public and generally more transparent, > especially now that we're in the age of wiki wiki.) I don't see how it's not public. R.26 is quite clear & detailed. When you submit an errata report, here is the instant response you get: Your message has been received. This automated reply is to let you know we appreciate your error report, and will act on it when time allows. Because Project Gutenberg is short-staffed, we often cannot handle errors as quickly as we'd like. This is mostly because none of the errata team can resist doing a full "tune up" on any file we look at -- often finding more problems, and sometimes doing a substantial re-working of the eBook's file (such as, creating a new HTML version). Your report is safe, and will be tended to as soon as possible. If you would like to follow up with further details, you can respond to this message and your follow up message will be grouped with your original message. If you have further questions, please email our "help" address. You'll get another automated response, but we usually answer questions within a few days. We also have mailing lists for discussion -- all described at www.gutenberg.org Thanks again for your error report. -- The Project Gutenberg Volunteer Errata Team > so it seems to me -- correct me if i'm wrong -- > that you need some more help in this regard... Of course we need more help. 
People interested in working on errata can get in touch with me, or email pgww AT lists.pglaf.org > thus, when the world hands you a chance to tell > a large number of your supporters how to help, > you should use that opportunity more wisely... > > > > I respond to hundreds of email messages > > concerning PG per week. You think they should > > all be Cc'd to some public address, too? > > ok, let's review. ricardo made his request publicly. > and you made a public "response" saying that you > "handled" the matter by sending him a backchannel. > > so you made a frontchannel and a backchannel reply. > > except the frontchannel reply -- the one that hundreds > of people interested enough in project gutenberg to > subscribe to the mailing list -- was totally uninformative. > > again, why not use the opportunity to let us subscribers > know how we could help in the errata effort? even if it's > info listed somewhere else, why not repeat it here, now? > (or at the very least provide a pointer to it? _the_full_url_.) I have higher expectations of gutvol-d, and presume that people already know where our FAQ is and how to find R.26. Or, can use Google. Anyway, http://www.gutenberg.org/wiki/Gutenberg:Readers%27_FAQ#R.26._I.27ve_found_some_obvious_typos_in_a_Project_Gutenberg_text._How_should_I_report_them.3F or http://www.gutenberg.org/wiki/Gutenberg:Readers%27_FAQ > so yes, i'm saying if an inquiry is made in public, like this, > you _should_ respond to it in public. that's transparency. That's your opinion, not mine, in this case. But to make you happy (even happier than usual), I'm responding to your inquiry publicly :) > and heck, even for the inquiries that are made in private, > you _might_ well want to make a response in public too... > > after all, that's what f.a.q. are for -- so you can answer a > question once, publicly, instead of over and over privately. > continuing to respond privately is a mistake in prioritization. 
> > because, honestly, if you're answering _hundreds_ of e-mails > every _week_, i don't think you're using your time in the best > way possible, and you should find a way to reduce that burden > so that you can instead pay attention to more important matters > -- the kind of things that newby-and-only-newby can do -- > instead of wearing yourself out on the less-important stuff... Moving help@ and errata@ to the RT system was one such step. Implementing the Web-based copyright clearance was another. Creating several of the various programs we use for posting new eBooks was another. Growing the WW group was another. Encouraging Charles to work on DP was another. It's quite an ongoing list of stuff that took items out of my inbox (where, for example *all* new eBooks arrived for several years) and automated, redistributed or reassigned them. I'm constantly making moves to work smarter, not harder. But because we're always doing new things, for new people, and more of it, there are always more opportunities for those newbyisms to arise. In practice, many of the things that land in my inbox COULD be addressed by other people. I'm constantly pushing things to other people...including gutvol-d. > this is constructive criticism, greg. be smart, and view it that way. Of course. People desiring to shoulder some -- any -- of the volunteer labor opportunities are always welcome to speak up. Perhaps YOU (anyone) reading this list, will choose to send a writeup on a procedure, opportunity, idea, etc. Feel encouraged. 
-- Greg From Bowerbird at aol.com Wed Aug 22 01:02:43 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 22 Aug 2007 04:02:43 EDT Subject: [gutvol-d] geoparsing books Message-ID: looks like the kids over at google are having fun playing with the toys: > http://radar.oreilly.com/archives/2007/08/books_in_google.html -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070822/52c6e92f/attachment.htm From Bowerbird at aol.com Wed Aug 22 01:35:03 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 22 Aug 2007 04:35:03 EDT Subject: [gutvol-d] Errata Team Message-ID: greg said: > Of course we need more help. > People interested in working on errata can > get in touch with me, or email pgww AT lists.pglaf.org bingo. (and this fact is _not_ in the f.a.q.) > Perhaps YOU (anyone) reading this list, > will choose to send a writeup on a procedure, > opportunity, idea, etc. Feel encouraged. thanks. myself, i am continuing to work on a methodology to resist the heavy-markup maniacs who want to sabotage the simple-yet-powerful foundation of project gutenberg, recently managing to worm their way onto the front page of the newsletter which michael hart _used_ to be the editor of... :+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070822/e031abc9/attachment.htm From Bowerbird at aol.com Wed Aug 22 14:32:49 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 22 Aug 2007 17:32:49 EDT Subject: [gutvol-d] maybe you want some good news Message-ID: maybe you want some good news, eh? ok, here ya go... 
roger frank, a distributed proofreaders volunteer, outlined a useful program yesterday that'd do a comparison of output from the first two rounds of proofing, and create individualized feedback for the people who had proofed the first round, showing changes (if any) in the second round -- i.e., for the most part, stuff they had _missed_... to get this kind of feedback before now, people had to negotiate a clumsy method to see "diffs", sad because it can be invaluable to beginners -- and often even to experienced proofers -- since it shows them holes in their performance. this is the kind of program i've always maintained would be easy to write, and extremely useful too. and i have known these things -- with certainty -- because i have _written_and_used_ apps like this... (i would've been happy to make my apps available to d.p., had people not treated me so rudely there.) here's roger's initial post, from monday, august 20th: > http://www.pgdp.net/phpBB2/viewtopic.php?p=359199#359199 here's a post roger made less than 30 hours later: > http://www.pgdp.net/phpBB2/viewtopic.php?p=359635#359635 in this post, he reports he wrote a small perl program that does the job. i'm quite sure that many d.p. people will find his program to be extremely useful, and roger is to be congratulated for creating it for the community. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070822/c5d1854c/attachment.htm From kkloos at dodo.com.au Wed Aug 22 16:56:44 2007 From: kkloos at dodo.com.au (Keith Kloosterman) Date: Thu, 23 Aug 2007 09:56:44 +1000 Subject: [gutvol-d] New Content Provider Message-ID: <46CCCD3C.50303@dodo.com.au> I am new to PG and would like some advice as a CP. Details: Own book in Dutch language of ab. 380 pages.Obtained copyright clearance. 
Have scanned 75 pages in PNG format. Used ABBYY Fine Reader (15 day limit and own Omipage Pro 9) to OCR these pages and proofread them. Started formatting in RTF format. Now puzzled which PG guidelines to follow. Q1. Are proofing guidelines meant for Distributed Proofreaders only? Q2. Should I download samples of scanned pages with formatted text to PG using FTP or for checking by DP? Please advise Keith Kloosterman From vze3rknp at verizon.net Wed Aug 22 18:51:43 2007 From: vze3rknp at verizon.net (Juliet Sutherland) Date: Wed, 22 Aug 2007 21:51:43 -0400 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CCCD3C.50303@dodo.com.au> References: <46CCCD3C.50303@dodo.com.au> Message-ID: <46CCE82F.40104@verizon.net> Documents that you get from the PG website apply to PG. Ones that you get from the DP website apply to DP. The PG Guidelines are aimed at individuals who want to prepare books on their own. The DP Guidelines apply to books that will pass or are currently passing through DP before getting to PG. I hope that clarifies matters a bit. JulietS Distributed Proofreaders Keith Kloosterman wrote: >I am new to PG and would like some advice as a CP. >Details: Own book in Dutch language of ab. 380 pages.Obtained copyright >clearance. Have scanned 75 pages in PNG format. Used ABBYY Fine Reader >(15 day limit and own Omipage Pro 9) to OCR these pages and proofread >them. Started formatting in RTF format. Now puzzled which PG guidelines >to follow. >Q1. Are proofing guidelines meant for Distributed Proofreaders only? >Q2. Should I download samples of scanned pages with formatted text to PG >using FTP or for checking by DP? 
> >Please advise >Keith Kloosterman >_______________________________________________ >gutvol-d mailing list >gutvol-d at lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > > From ricardofdiogo at gmail.com Wed Aug 22 18:52:36 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Thu, 23 Aug 2007 02:52:36 +0100 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CCCD3C.50303@dodo.com.au> References: <46CCCD3C.50303@dodo.com.au> Message-ID: <9c6138c50708221852x78685954j80ec44a15265a0fb@mail.gmail.com> 2007/8/23, Keith Kloosterman : > I am new to PG and would like some advice as a CP. Welcome Keith! > Details: Own book in Dutch language of ab. 380 pages.Obtained copyright > clearance. Have scanned 75 pages in PNG format. Used ABBYY Fine Reader > (15 day limit and own Omipage Pro 9) to OCR these pages and proofread > them. Started formatting in RTF format. Now puzzled which PG guidelines > to follow. You will find all sort of frequently asked questions at http://www.gutenberg.org/wiki/Category:FAQ . The most complete ones are the Volunteer's FAQs > Q1. Are proofing guidelines meant for Distributed Proofreaders only? Yes. For instance, DP's formatting guidelines use < i > and < / i > as a convention for italic while here at PG we mostly use _underscores_ nowadays. > Q2. Should I download samples of scanned pages with formatted text to PG > using FTP or for checking by DP? > If you're asking if you can upload the raw images/scans of your paper book so that everyone can see them from PG's catalog altogether with the ebook, the answer is yes. Someone will provide you with an FTP account. 
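To make the italics convention from Q1 concrete, here is a small sketch of going from the tag style to the underscore style. It is only an illustration in Python: the function name italics_to_underscores is invented, it assumes the tags are written without the extra spaces used above, and it is not the tool DP or PG actually uses for this step.

```python
import re


def italics_to_underscores(text: str) -> str:
    """Replace <i>...</i> spans with _..._ for a plain-text edition."""
    # Non-greedy match so two italic spans on one line stay separate;
    # DOTALL lets a span run across a line break.
    return re.sub(
        r"<i>(.*?)</i>",
        r"_\1_",
        text,
        flags=re.DOTALL | re.IGNORECASE,
    )


print(italics_to_underscores("a <i>very</i> old book"))
```

Going this direction is a one-line substitution because the opening and closing tags are different; going from underscores back to tags is harder, since every underscore looks the same.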
Ricardo From ralf at ark.in-berlin.de Thu Aug 23 01:23:32 2007 From: ralf at ark.in-berlin.de (Ralf Stephan) Date: Thu, 23 Aug 2007 10:23:32 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <9c6138c50708221852x78685954j80ec44a15265a0fb@mail.gmail.com> References: <46CCCD3C.50303@dodo.com.au> <9c6138c50708221852x78685954j80ec44a15265a0fb@mail.gmail.com> Message-ID: <20070823082332.GC2582@ark.in-berlin.de> Ricardo wrote > 2007/8/23, Keith Kloosterman : > > Q1. Are proofing guidelines meant for Distributed Proofreaders only? > Yes. For instance, DP's formatting guidelines use < i > and < / i > as > a convention for italic while here at PG we mostly use _underscores_ > nowadays. Just to correct a slightly misleading answer. The HTML tags are later changed to underscores etc too in DP _in the text-only version_. ralf From shabam.dp at gmail.com Thu Aug 23 09:54:45 2007 From: shabam.dp at gmail.com (shabam) Date: Thu, 23 Aug 2007 09:54:45 -0700 Subject: [gutvol-d] New Content Provider In-Reply-To: <20070823082332.GC2582@ark.in-berlin.de> References: <46CCCD3C.50303@dodo.com.au> <9c6138c50708221852x78685954j80ec44a15265a0fb@mail.gmail.com> <20070823082332.GC2582@ark.in-berlin.de> Message-ID: <1ac896090708230954md07d6b8u772fe1e955415fa7@mail.gmail.com> in DP, we use <i> in formatting because we want to be able to create HTML versions. It is easier to change <i> and </i> to _ than it is to change _ to <i> or </i> (although not impossible). If all you care about is plain text, then _underscores_ are fine. However, if you want to make it look pretty at all, HTML is a way to do that. Jason From Bowerbird at aol.com Thu Aug 23 10:07:14 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 23 Aug 2007 13:07:14 EDT Subject: [gutvol-d] New Content Provider Message-ID: one thing you should consider is to have d.p. digitize your book... all you have to do is furnish 'em the scan-set, and they'll do the rest. a downside to this is that it might take a long time to finish the job. 
(some books have taken years, literally. but if you ain't in a hurry...) or, if you'd like to ride herd on the task, but get help with _proofing_ -- where the more eyes you have, the less likely errors will persist -- provide the project to d.p. and then take on various responsibilities, acting as the "project manager" and "post-processor" for the project. other people will proof, and you'll manage and assemble the results. it usually doesn't take long for a book to traverse the various rounds -- the waiting is most often to get _into_ a round, and as a first-timer, i believe you'll be allowed to skip to the head of most of the queues -- so you won't be looking at a very long delay to receive this extra help. (straightforward books clear each round in a matter of days or weeks.) if you _do_ decide to go with distributed proofreaders, one thing you should constantly keep in mind is that they do a lot of things _wrong_, so you must always resist the impulse to "learn" from their procedures. most egregiously, their formatting conventions are quite abominable, so do _not_ take them as "the way it should be done". it's far better to think of it as a shining example of "the way it should _not_ be done"... likewise, their tools are customized to their workflow, so even though the _tools_ are well-done, they make you conform to a bad workflow, and it will be difficult to keep those two things separate in your head. on the other hand, if there are any "strange" things about your book, d.p. is a good place to be, because the odds are that _someone_ has already encountered something similar, and learned how to handle it. greek words? someone (perhaps lucy?) will be able to deal with 'em. hairy tables? call in the team of weirdos who _like_ to work on tables. lots of colorful illustrations that you were not able to scan very well? one of their many graphics specialists will swoop in to your rescue... thus, if you're one of those people who works better in a community, d.p. 
is the place for you to be. i've made a lot of noise on this listserve about how rudely they've treated me -- it's true -- but it is _also_ true that they treat almost everyone else stumbling onto the site very nicely. (the over-nice typical of cults, but if that doesn't bother you...) ;+) having said all that, though, the simple fact is that it doesn't take _that_ long to spellcheck a book's o.c.r., providing you made clean scans and got relatively decent text recognized, maybe an hour, or two or three... use a spellchecker where you can specify an "auxiliary dictionary" and make sure that you add _names_ and other novel-but-correct words to the dictionary right off so you don't have to look at 'em repeatedly. then if you read the book clean through, with a sharp eye for "stealth" scannos -- words that passed spellcheck because they are a _properly_ spelled word, except they are the _wrong_ word, like "ban" for "can" -- you will end up with a text that is clean enough to put out to the world. if you do this careful read through the book, for content and meaning, it's really unnecessary to compare each word on each page to the scan. if you want to create an .html version, you can do it rather easily using your text file; just ask me to explain "zen markup language" to do that. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070823/a6c7143d/attachment.htm From kkloos at dodo.com.au Thu Aug 23 15:18:57 2007 From: kkloos at dodo.com.au (Keith Kloosterman) Date: Fri, 24 Aug 2007 08:18:57 +1000 Subject: [gutvol-d] New Content Provider Message-ID: <46CE07D1.9010605@dodo.com.au> Thanks for the various answers received. Still sorting out the do's and don'ts. Q1.My book contains pictures which I scan and will submit when ready. 
Does the PG team change their format from png to HTML to enable readers to see the illustrations? Q2. The book contains an extensive index in the back referring to page numbers.I seem to have the option of keeping the page numbers or entering Chapter numbers (which don't exist at the moment, but there are section headings which I could give a Chapter number) and then referring the Index entries to the Chapters. Q3. When ready to submit the files to PG do you still want the scanned pages (in png format) in addition to the rtf files? Comments: I do have Acrobat 6 Prof and found the scanning/OCR feature not as good as your recommended ABBYY Fine Reader. I probably could supply the lot in pdf format, but am not so sure about HTML. Thanks for the responses. From ricardofdiogo at gmail.com Thu Aug 23 16:24:04 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Fri, 24 Aug 2007 00:24:04 +0100 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE07D1.9010605@dodo.com.au> References: <46CE07D1.9010605@dodo.com.au> Message-ID: <9c6138c50708231624i4a7edb27y88fbd56c8615976d@mail.gmail.com> 2007/8/23, Keith Kloosterman : > Q1.My book contains pictures which I scan and will submit when ready. > Does the PG team change their format from png to HTML to enable readers > to see the illustrations? See yourself as a member of the PG team. You'll have to do the HTML file yourself or ask some other volunteer to help you. See: http://www.gutenberg.org/wiki/Gutenberg:HTML_FAQ > Q2. The book contains an extensive index in the back referring to page > numbers.I seem to have the option of keeping the page numbers or > entering Chapter numbers (which don't exist at the moment, but there are > section headings which I could give a Chapter number) and then referring > the Index entries to the Chapters. Do chapters have a title? Maybe it's a glossary? 
something like: Love: 21, 34, 56, 76 Tree: 45, 67, 87, 102 If you think those references are really important you can keep the page numbers throughout the entire text. See: http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#V.98._Should_I_keep_page_numbers_in_the_e-text.3F Or you can do something like: Love: "Mary's Life", "John's Passion" Tree: "The log", "Christmas Eve" Or you can give a number to the chapters, yes. In any case, add a Transcriber's Note explaining what you have done. In the HTML version, you just need to make internal links. > Q3. When ready to submit the files to PG do you still want the scanned > pages (in png format) in addition to the rtf files? PG will welcome the png files of every page. Remember you _have to_ submit a plain text version (*.txt). If you also want to provide a RTF file, that's OK. If you mean the individual rtf files for each page of the book, no, we don't keep those. From kkloos at dodo.com.au Thu Aug 23 16:41:05 2007 From: kkloos at dodo.com.au (Keith Kloosterman) Date: Fri, 24 Aug 2007 09:41:05 +1000 Subject: [gutvol-d] New Content Provider Message-ID: <46CE1B11.1070308@dodo.com.au> Bowerbird wrote: having said all that, though, the simple fact is that it doesn't take _that_ long to spellcheck a book's o.c.r., providing you made clean scans and got relatively decent text recognized, maybe an hour, or two or three... use a spellchecker where you can specify an "auxiliary dictionary" and make sure that you add _names_ and other novel-but-correct words to the dictionary right off so you don't have to look at 'em repeatedly. then if you read the book clean through, with a sharp eye for "stealth" scannos -- words that passed spellcheck because they are a _properly_ spelled word, except they are the _wrong_ word, like "ban" for "can" -- you will end up with a text that is clean enough to put out to the world. 
if you do this careful read through the book, for content and meaning, it's really unnecessary to compare each word on each page to the scan. ** spellchecking is not a problem for me. The scans are clean but a lot of text shows the spelling in the 1600's in The Netherlands and therefor not in the spellchecker. To get a true copy, those words have to be checked letter by letter.DP volunteers would need to be conversant with the Dutch language, IMHO Your comments about DP are not encouraging me much to go that path. I do have the time and at this stage would rather persevere _to ride herd on the task_ as you called it. In the mean time my questions have not been answered. From Bowerbird at aol.com Thu Aug 23 16:55:01 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 23 Aug 2007 19:55:01 EDT Subject: [gutvol-d] New Content Provider Message-ID: i don't speak for p.g., and especially not for d.p., but i'll try to help. *** keith said: > Q1. My book contains pictures which I scan and will submit when ready. > Does the PG team change their format from png to HTML > to enable readers to see the illustrations? you've made a mental slip. .html is not a picture format, like .png is. in order to combine the text and pictures for viewing in a browser, you would have to create an additional .html file from the _text_ file. and no, there's nobody at p.g. proper who would do that for you... it _is_ something that the .html team over at d.p. would do for you, just another example of how the teamwork approach can help you. if you don't want to do the .html, you can _still_ submit the pictures. either way, i recommend that you make notations in the text-file to indicate the filename of each graphic where it actually appears. > Q2. 
The book contains an extensive index in the back > referring to page numbers.I seem to have the option > of keeping the page numbers or entering Chapter numbers > (which don't exist at the moment, but there are section headings > which I could give a Chapter number) and then referring the > Index entries to the Chapters. it's better to keep the page-numbers than create chapter-numbers. (an invented set of numbers will do little except confuse readers, not to mention that it'd create a lot of unnecessary work for you.) if you do an .html version, the page-numbers inside the text can be maintained and made unobtrusive, and then the index can be linked. you can find many examples of this approach in the p.g. library now. and looking at existing books is a _fantastic_ way to see how to do it. for the text version, if you keep the page-numbers inside the text, they will likely be obtrusive. for this reason, in the past, people often stripped the page-numbers out of the text and from the index too. that's a little extreme, in my opinion. i would still keep the numbers in the index, even though they don't really _point_ to anything now, in case someone wants to produce the links sometime in the future. whether you keep them in the text is up to you. (it's _all_ up to you.) it _can_ be difficult to proof an index if you didn't get good o.c.r. not to sound like a broken record, but there's a _team_ of people over at d.p. who _like_ to do indexes, so you could leave it to them. this is one of the beauties of a distributed approach, that someone else probably _likes_ to do the things that you hate, and vice versa. > Q3. When ready to submit the files to PG do you still want > the scanned pages (in png format) in addition to the rtf files? yes, yes, yes! p.g. will post them if you submit them, and you should. it makes it possible for people to act on error-reports down the line. p.g. 
didn't encourage this in the past (and still doesn't do it enough) because scans really take up a lot of disk-space. but since storage is so cheap these days, the balance of utility has shifted to keeping them. > Comments: I do have Acrobat 6 Prof and found the scanning/OCR > feature not as good as your recommended ABBYY Fine Reader. broken-record time again: if you didn't get clean o.c.r. on your scans, it will be much more work than it should be to proof the text, so you might want to consider submitting the scan-set to d.p. to do the job, as they have a number of people who can o.c.r. the work using abbyy. and even if the results from abbyy are bad, the proofers over at d.p. have been subjected to so much crappy o.c.r., they won't complain. (it's really obscene how much _awful_ o.c.r. they've had to clean up, honestly, but they plowed through it, so they can handle it all now.) > I probably could supply the lot in pdf format, but am not so sure > about HTML. Thanks for the responses. .pdf is a terrible format in ways, and p.g. pretty much doesn't want it. *** your questions are very good, and typical of the ones _most_ people will have when embarking on the task of digitizing a book, which is why it can be good to be involved with a community of people like d.p., where people are willing to help you whenever you run into a problem. sometimes it'll make the difference between giving up and going on... i'd suggest that you trot on over to their website -- http://pgdp.net -- and sign up and proof a few pages in their system to get your feet wet. then go into their _forums_ and monitor a few threads. the experience will help you with your book even if you decide you want to do it alone. -bowerbird p.s. if you do go and proof a few pages at d.p., remember what i said about their workflow being messed up. you _don't_ have to proof text the way they do it over there, by comparing _every_word_ to the scan. 
if your scans are reasonably well-done, you'll get relatively clean o.c.r., meaning that a simple spellcheck of the text should do the job _fine_, provided that you do a read-through of the book, reading for content. (and if you don't have clear scans, then it's cost-efficient to redo them.) ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070823/63165603/attachment.htm From Bowerbird at aol.com Thu Aug 23 17:11:17 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 23 Aug 2007 20:11:17 EDT Subject: [gutvol-d] New Content Provider Message-ID: keith said: > spellchecking is not a problem for me. > The scans are clean but a lot of text shows > the spelling in the 1600's in The Netherlands > and therefor not in the spellchecker. ok, that is an important consideration. still, once you have added one of these old spellings to the auxiliary dictionary, the spellchecker will pass other instances. > To get a true copy, those words have to be > checked letter by letter. that's really not the case, but it certainly won't hurt if you choose to do the proofing that way. > DP volunteers would need to be conversant > with the Dutch language, IMHO well, d.p. has a number of dutch proofers, actually, including ones quite familiar with older spellings. (say hello to the nice judith_amsterdam for me!) > Your comments about DP are not encouraging me > much to go that path. I do have the time and at this stage > would rather persevere _to ride herd on the task_ > as you called it. submitting to d.p. _will_ allow you to maintain control. you just have to sign up to be the _project_manager_ and the _post-processor_ of the book, and -- as the _content-provider_ for it -- you will have top priority for those other positions. 
but you'd get help proofing, on the .html creation, and with the indexing and such, whenever you needed it. since this is your first book, i have to believe that their experience would be useful... of course, i also understand independence, so if you decide to go that route, peace and blessings to you... > In the mean time my questions have not been answered. do please let us know if you still have any unanswered questions... -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070823/209f54de/attachment.htm From marcello at perathoner.de Thu Aug 23 17:35:34 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri, 24 Aug 2007 02:35:34 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE1B11.1070308@dodo.com.au> References: <46CE1B11.1070308@dodo.com.au> Message-ID: <46CE27D6.80709@perathoner.de> Keith Kloosterman wrote: > spellchecking is not a problem for me. The scans are clean but a lot of > text shows the spelling in the 1600's in The Netherlands and therefor > not in the spellchecker. To get a true copy, those words have to be > checked letter by letter.DP volunteers would need to be conversant with > the Dutch language, IMHO ... and a lot of them are. BTW spell checkers don't work for old texts because they often spell the same word differently through the whole text. > Your comments about DP are not encouraging me much to go that path. I do > have the time and at this stage would rather persevere _to ride herd on > the task_ as you called it. The first and most important piece of advice for new people is not to listen to what bowerbird says. This may sound arrogant, but is the sad truth. 
The rationale behind this advice can be found here: http://www.gnutenberg.de/bowerbird/ For answers start here: http://www.gutenberg.org/wiki/Category:FAQ e.g.: http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#V.98._Should_I_keep_page_numbers_in_the_e-text.3F -- Marcello Perathoner webmaster at gutenberg.org From grythumn at gmail.com Thu Aug 23 18:00:24 2007 From: grythumn at gmail.com (Robert Cicconetti) Date: Thu, 23 Aug 2007 21:00:24 -0400 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE1B11.1070308@dodo.com.au> References: <46CE1B11.1070308@dodo.com.au> Message-ID: <15cfa2a50708231800y1deb2fe1ve075c51f55cef788@mail.gmail.com> On 8/23/07, Keith Kloosterman wrote: > spellchecking is not a problem for me. The scans are clean but a lot of > text shows the spelling in the 1600's in The Netherlands and therefor > not in the spellchecker. To get a true copy, those words have to be > checked letter by letter.DP volunteers would need to be conversant with > the Dutch language, IMHO Just as a warning, bowerbird was kicked off of the DP forums (the only time it was ever necessary) for constant disruptive behavior. Please don't take his statements as definitive, but make up your own mind. The current DP process includes 3 proofing rounds and 2 formatting rounds. The PM can request that only people fluent in Dutch work on a project; this will tend to make a project take longer to complete. Often, most of the third proofing round is done by people fluent in a language, if possible. There are several disadvantages to putting a project through DP. First, it takes time. 1-2 years from initial scanning to final posting is not uncommon. Second, we only work in Latin-1; some unusual characters are handled by combination, not the full Unicode set. IIRC, this should not affect most Dutch works. There are also a number of advantages. First, it spreads the work over more people, and reduces the chance of a project getting halfway finished and then abandoned.
Also, different eyes will often spot errors that were missed by the first proofer. Second, we have a lot of institutional knowledge and an active and helpful community. If you chose to postprocess the book, there are clear guidelines, and an experienced PPV will check over your book before posting, and there are tools to aid in the checks and generating the HTML. We have recently added an online spellchecker that can be easily customized to an older Dutch dialect. Third, it allows you to concentrate on the aspect of the process that you prefer; I generally stick to providing content (scanning/harvesting books, OCR, and initial preparation). There are also a number of people who prefer to work on a book by themselves, and polish it from end to end. I have vast respect for their patience. :) However, let us separate the two processes: * If you submit a book to PG, it should be in its final form, ready for the world to download. * If you submit a book to DP, we need at the least page images and high resolution scans of any illustrations, and optionally OCR saved into one text file per page. You can choose to manage the project through the rounds and postprocess it yourself, or you can leave it to others. As to your other questions.. Although they'll publish nearly any format, PG vastly prefers having books in a format that they can correct. A PDF, by itself, is very difficult to correct, unless you include the files it was generated from, and the tools to correct it are available. For example, LaTeX or TEI books can automatically generate PDFs. Also consider that a PDF is very difficult to read on a cell phone or PDA. > Thanks for the various answers received. Still sorting out the do's and > don'ts. > Q1.My book contains pictures which I scan and will submit when ready. > Does the PG team change their format from png to HTML to enable readers > to see the illustrations? No, they post the book as provided. 
I think at one point they may have tried to produce a .TXT file if one was not provided, but I do not know whether that is the case now. > Q2. The book contains an extensive index in the back referring to page > numbers.I seem to have the option of keeping the page numbers or > entering Chapter numbers (which don't exist at the moment, but there are > section headings which I could give a Chapter number) and then referring > the Index entries to the Chapters. Generally, DP works omit the page numbers in the index for the .TXT file version, but include page anchors in the HTML editions. Word and Acrobat provide similar functionality, but it's harder to do it automatically. > Q3. When ready to submit the files to PG do you still want the scanned > pages (in png format) in addition to the rtf files? There are separate, fairly new, guidelines for submitting page images. If you chose to use DP, you don't have to worry about it. http://www.gutenberg.org/wiki/Gutenberg:Scanning_FAQ#S.21._Will_PG_store_scanned_page_images_of_my_book.3F R C From Bowerbird at aol.com Thu Aug 23 23:09:40 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 24 Aug 2007 02:09:40 EDT Subject: [gutvol-d] New Content Provider Message-ID: robert said: > bowerbird was kicked off of the DP forums > (the only time it was ever necessary) > for constant disruptive behavior. yeah, right. and bush went to war against iraq because they had weapons of mass destruction. it's necessary to look past the convenient excuse. and why do you feel a need to do negative spin? if you disagree with anything i said -- anything -- then present your position and your reasoning... -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070824/05ea29e8/attachment.htm From gbnewby at pglaf.org Fri Aug 24 10:52:14 2007 From: gbnewby at pglaf.org (Greg Newby) Date: Fri, 24 Aug 2007 10:52:14 -0700 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE07D1.9010605@dodo.com.au> References: <46CE07D1.9010605@dodo.com.au> Message-ID: <20070824175214.GA30400@mail.pglaf.org> Hi, Keith, and welcome. You got a few different answers...here are mine. The FAQ is indeed a good place to start (www.gutenberg.org). As others have mentioned, Distributed Proofreaders has a tighter set of guidelines which you might enjoy reading about. In fact, it might be appealing for you to proof at least a few pages there, and read some of their formatting advice, since it will show you some of the ways that things are done. BTW, if you didn't get a copyright clearance yet, start at http://copy.pglaf.org More: On Fri, Aug 24, 2007 at 08:18:57AM +1000, Keith Kloosterman wrote: > Thanks for the various answers received. Still sorting out the do's and > don'ts. > Q1.My book contains pictures which I scan and will submit when ready. > Does the PG team change their format from png to HTML to enable readers > to see the illustrations? PNG, GIF or JPG are fine as formats. There's no team but you to do the markup etc. :) At least, unless you ask for help on gutvol-d or elsewhere. The FAQ has lots of guidelines on image sizes & formats, and how to embed. > Q2. The book contains an extensive index in the back referring to page > numbers.I seem to have the option of keeping the page numbers or > entering Chapter numbers (which don't exist at the moment, but there are > section headings which I could give a Chapter number) and then referring > the Index entries to the Chapters. You can use named anchor tags, to jump straight to the page. Jumping to the top of the chapter is also viable. 
Some people do keep the page numbers in their eBooks, despite the eBooks being significantly reformatted, reflowed, etc. It's fine to do so. Personally, I think it's of less interest for most fiction than for some reference books and other non-fiction, since the page numbers give an anchor for citations, as well as for comparisons to versions of the book found elsewhere. > Q3. When ready to submit the files to PG do you still want the scanned > pages (in png format) in addition to the rtf files? We do like to get page images. There's more on that in the FAQ. As to RTF: it's more typical to get a .txt and .htm, without an .rtf. But all three (or combinations) are also OK. If you're asking whether we want your intermediate OCR results: no, that's not something we typically try to collect. > Comments: I do have Acrobat 6 Prof and found the scanning/OCR feature > not as good as your recommended ABBYY Fine Reader. I probably could > supply the lot in pdf format, but am not so sure about HTML. > Thanks for the responses. We're less interested in PDF as a stored file, since we can't easily fix it or make derivatives. -- Greg From Bowerbird at aol.com Fri Aug 24 11:39:10 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Fri, 24 Aug 2007 14:39:10 EDT Subject: [gutvol-d] online presence leads to increased hard-copy borrowing Message-ID: in an exchange over at o'reilly: > http://radar.oreilly.com/archives/2007/08/the_google_exch.html#comments gary charbonneau said: > I understand that there is some anecdotal evidence > from the original Google libraries that > books are more likely to be used in their physical manifestation > after digitization than they were before. interesting. of course, it mirrors the reports that putting a book online _increases_the_print_sales_. simply, an on-line presence stimulates offline use. it provokes one to say that putting books online might be a way to _save_libraries_ from oblivion. but then, of course, we realize that that's obvious.
because if books don't go online, which we agree is where _all_ the action will happen in the future, then they will _most_certainly_ die a quick death... -bowerbird From piggy at netronome.com Mon Aug 27 11:42:54 2007 From: piggy at netronome.com (La Monte Henry Piggy Yarroll) Date: Mon, 27 Aug 2007 14:42:54 -0400 Subject: [gutvol-d] People Behind PG In-Reply-To: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> Message-ID: <46D31B2E.2050404@netronome.com> Ricardo F Diogo wrote: > I'm trying to write a more complete "People Behind PG" page for the > wiki. Please edit my user page in case you have more accurate info: > http://www.gutenberg.org/wiki/User:Ricdiogo/behind > > Ricardo > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > I see this is now here: http://www.gutenberg.org/wiki/The_People_Behind_Project_Gutenberg One name I was expecting to see but do not is Juliet Sutherland. From shabam.dp at gmail.com Mon Aug 27 12:31:38 2007 From: shabam.dp at gmail.com (shabam) Date: Mon, 27 Aug 2007 12:31:38 -0700 Subject: [gutvol-d] People Behind PG In-Reply-To: <46D31B2E.2050404@netronome.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> Message-ID: <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> She is under copyright clearance team. DP is separate from PG, so aside from her work with the copyright clearance team how would you have her listed?
Jason On 8/27/07, La Monte Henry Piggy Yarroll wrote: > Ricardo F Diogo wrote: > > I'm trying to write a more complete "People Behind PG" page for the > > wiki. Please edit my user page in case you have more accurate info: > > http://www.gutenberg.org/wiki/User:Ricdiogo/behind > > > > Ricardo > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d at lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > I see this is now here: > http://www.gutenberg.org/wiki/The_People_Behind_Project_Gutenberg > > One name I was expecting to see but do not is Juliet Sutherland. > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From ricardofdiogo at gmail.com Mon Aug 27 12:37:05 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Mon, 27 Aug 2007 20:37:05 +0100 Subject: [gutvol-d] People Behind PG In-Reply-To: <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> Message-ID: <9c6138c50708271237p3ff0c0b6t6957dc36282ea615@mail.gmail.com> Juliet is now under copyright clearance team. And I've also created a special section for Distributed Proofreaders. 
Ricardo From ricardofdiogo at gmail.com Mon Aug 27 12:48:47 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Mon, 27 Aug 2007 20:48:47 +0100 Subject: [gutvol-d] People Behind PG In-Reply-To: <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> Message-ID: <9c6138c50708271248p12e28e1fla40dfc2ccfe0592@mail.gmail.com> 2007/8/27, shabam : > DP is separate from PG, so > aside from her work with the copyright clearance team how would you > have her listed? > > Jason > Same reasoning prevented me from adding DPers to that page in the first place. But, God, was I wrong! DPers are _DEFINITIVELY_ people behind PG. They are our main core. Ricardo From shabam.dp at gmail.com Mon Aug 27 12:51:36 2007 From: shabam.dp at gmail.com (shabam) Date: Mon, 27 Aug 2007 12:51:36 -0700 Subject: [gutvol-d] People Behind PG In-Reply-To: <9c6138c50708271248p12e28e1fla40dfc2ccfe0592@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> <9c6138c50708271248p12e28e1fla40dfc2ccfe0592@mail.gmail.com> Message-ID: <1ac896090708271251w444ec4c1q6bd7f6cc87a76011@mail.gmail.com> I agree.... I was just assuming that you were keeping it strictly to PG. If you are adding DP, are you going to add DP-EU, etc? Jason On 8/27/07, Ricardo F Diogo wrote: > 2007/8/27, shabam : > > DP is separate from PG, so > > aside from her work with the copyright clearance team how would you > > have her listed? > > > > Jason > > > Same reasoning prevented me from adding DPers to that page in the > first place. But, God, was I wrong! DPers are _DEFINITIVELY_ people > behind PG. They are our main core. 
> > Ricardo > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From ricardofdiogo at gmail.com Mon Aug 27 13:15:09 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Mon, 27 Aug 2007 21:15:09 +0100 Subject: [gutvol-d] People Behind PG In-Reply-To: <1ac896090708271251w444ec4c1q6bd7f6cc87a76011@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> <9c6138c50708271248p12e28e1fla40dfc2ccfe0592@mail.gmail.com> <1ac896090708271251w444ec4c1q6bd7f6cc87a76011@mail.gmail.com> Message-ID: <9c6138c50708271315l77afaa1s862f425791923e48@mail.gmail.com> 2007/8/27, shabam : > I agree.... I was just assuming that you were keeping it strictly to > PG. If you are adding DP, are you going to add DP-EU, etc? > > Jason > Yes, I think. DP-E also directly provides ebooks for PG. Ricardo From walter.van.holst at xs4all.nl Mon Aug 27 16:13:00 2007 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Tue, 28 Aug 2007 01:13:00 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE1B11.1070308@dodo.com.au> References: <46CE1B11.1070308@dodo.com.au> Message-ID: <46D35A7C.8090605@xs4all.nl> Keith Kloosterman wrote: > spellchecking is not a problem for me. The scans are clean but a lot of > text shows the spelling in the 1600's in The Netherlands and therefor > not in the spellchecker. To get a true copy, those words have to be > checked letter by letter.DP volunteers would need to be conversant with > the Dutch language, IMHO Drop by with the band of Dutch volunteers on DP, several books from the 1600s have already passed through their hands and they have experience with the typographic oddities of books from that era, such as the long s that actually denotes a t. 
As far as Bowerbird's advice is concerned: most valid qualifications of him require four-letter words. Or to paraphrase Douglas Adams: "mostly useless". Regards, Walter From Bowerbird at aol.com Mon Aug 27 16:37:49 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Mon, 27 Aug 2007 19:37:49 EDT Subject: [gutvol-d] New Content Provider Message-ID: walter said: > As far as Bowerbird's advice is concerned: > most valid qualifications of him require four-letter words. > Or to paraphrase Douglas Adams: "mostly useless". this coming from people who pat themselves on the back endlessly about how "friendly" they are. it's quite amusing. :+) meanwhile, you see another demonstration about how they are incapable of disputing the accuracy of _what_ i say, but nonetheless try to impugn the reputation of the messenger. i joke -- somewhat -- when i refer to d.p. as "a cult", but anyone who's had dealings with scientology will recognize the patterns of behavior exhibited... (and, for the record, i get along quite well with some of the people from d.p., just as i get along quite well with some scientologists...) i must say that the d.p. people missed a good opportunity. they could have offered to proof keith's book for him, with no obligation. wouldn't have been the first time a book has been proofed without being uploaded -- officially -- by d.p. the good-faith gesture might have converted an outsider... -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070827/b09af75f/attachment.htm From robert_marquardt at gmx.de Mon Aug 27 22:19:47 2007 From: robert_marquardt at gmx.de (Robert Marquardt) Date: Tue, 28 Aug 2007 07:19:47 +0200 Subject: [gutvol-d] People Behind PG In-Reply-To: <9c6138c50708271237p3ff0c0b6t6957dc36282ea615@mail.gmail.com> References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> <9c6138c50708271237p3ff0c0b6t6957dc36282ea615@mail.gmail.com> Message-ID: On Mon, 27 Aug 2007 20:37:05 +0100, you wrote: >Juliet is now under copyright clearance team. And I've also created a >special section for Distributed Proofreaders. Do not forget to add her to the Wiki team also. -- Robert Marquardt (Team JEDI) http://delphi-jedi.org From Bowerbird at aol.com Tue Aug 28 10:41:50 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 28 Aug 2007 13:41:50 EDT Subject: [gutvol-d] showdown Message-ID: here's an extremely impressive implementation of _markdown_, one of the light-markup systems: > http://www.attacklab.net/showdown-gui.html one line of code on a webpage turns a regular textarea field into wysiwyg, with the added benefit that the .html that is created is semantic in structure. another line of .html shows a preview. markdown isn't _quite_ zen enough -- not for me, anyway -- but _you_ might become quite enamored by it. if distributed proofreaders were to use this, they could eliminate the formatting rounds, and improve their efficiency _remarkably_... -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070828/e333a49c/attachment.htm From jeroen.mailinglist at bohol.ph Tue Aug 28 12:20:48 2007 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Tue, 28 Aug 2007 21:20:48 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <46CE07D1.9010605@dodo.com.au> References: <46CE07D1.9010605@dodo.com.au> Message-ID: <46D47590.2070700@bohol.ph> Keith Kloosterman wrote: > Thanks for the various answers received. Still sorting out the do's and > don'ts. > Q1.My book contains pictures which I scan and will submit when ready. > Does the PG team change their format from png to HTML to enable readers > to see the illustrations? > The PG whitewashers (people who post texts to PG) do pretty little with the HTML you submit. PNGs used as illustrations will be kept. They check for compliance with a few rules, and bounce it back to you when it does not follow these. They will slam a header and trailer on it. > Q2. The book contains an extensive index in the back referring to page > numbers.I seem to have the option of keeping the page numbers or > entering Chapter numbers (which don't exist at the moment, but there are > section headings which I could give a Chapter number) and then referring > the Index entries to the Chapters. > You can study a few existing HTML versions, and note that they place the original page numbers in the margin, where they also double as HTML anchors to make index entries link to. > Q3. When ready to submit the files to PG do you still want the scanned > pages (in png format) in addition to the rtf files? > > The png format pages will be welcome. If you need any help, you can contact me, I've handled a good deal of the Dutch books in PG. Jeroen Hellingman. 
> > From donovan at abs.net Tue Aug 28 15:06:45 2007 From: donovan at abs.net (D Garcia) Date: Tue, 28 Aug 2007 18:06:45 -0400 Subject: [gutvol-d] showdown In-Reply-To: References: Message-ID: <200708281806.45837.donovan@abs.net> On Tuesday 28 August 2007 13:41, Bowerbird at aol.com wrote: > one line of code on a webpage turns > a regular textarea field into wysiwyg, One line of code on a webpage which requires (admittedly free) sign up, which links in substantially more than one line of code from their server, and makes anyone using it dependent on their server(s) staying up, and their code not ever being broken by an update. You also get to hope they never stop supporting or offering it. From lee at novomail.net Tue Aug 28 16:33:59 2007 From: lee at novomail.net (Lee Passey) Date: Tue, 28 Aug 2007 17:33:59 -0600 Subject: [gutvol-d] New Content Provider In-Reply-To: <46D35A7C.8090605@xs4all.nl> References: <46CE1B11.1070308@dodo.com.au> <46D35A7C.8090605@xs4all.nl> Message-ID: <46D4B0E7.8080809@novomail.net> Walter van Holst wrote: [snip] > As far as Bowerbird's advice is concerned: most valid qualifications of > him require four-letter words. Or to paraphrase Douglas Adams: "mostly > useless". Bowerbird can be socially inept, technologically naive, and unreasonably strident, but he is /not/ stupid. I found the advice he offered mostly in line with my own experience, at least to the extent that it does not involve working with PG and DP (with which I have little or no experience). To summarize my experience: 1. Use ABBYY FineReader version 8. Using a product other than FineReader will simply cause you more headache than it is worth. 2. Spell-check the result using the FineReader spell check program. It's not as good as some, but it does allow you to create user-defined dictionaries, and it will highlight the supposed misspelling in the context of the actual scan. 3. DO NOT save the file as text or PDF. 
Save as HTML or RTF, depending on which format you are most comfortable with. 4. Smooth read the final result for comprehension. I think you will find that scannos will virtually leap of the page at you. Trying to compare the final result with the scanned image in an "image against image" kind of way will lead to too many missed errors. The human mind needs context. At this point you will have a product better than almost anything PG has produced in its early years. You can now submit to PG as a final product, or to DP for final tweaking. From Bowerbird at aol.com Tue Aug 28 17:15:27 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 28 Aug 2007 20:15:27 EDT Subject: [gutvol-d] showdown Message-ID: donovan said: > makes anyone using it dependent wrong. there are a number of open-source implementations -- in multiple languages -- of the markdown parser. even the source for this java-based one is available... (the one-line method is there for non-programmers.) *** moreover, all of this is still beside the point, because even an elementary-school perl coder (like me) can program a routine to translate light-markup to .html. and i say this with certainty, since i've already done it: > http://z-m-l.com/go/vl3.pl *** and yet, distributed proofreaders continues to use a byzantine pseudo-markup, necessitating extra work, as it requires formatting be separated from proofing. that's why there's been a drop in the number of books finished monthly by d.p. the past few years (in spite of the increased number of volunteers that are now being attracted by a banner on the p.g. site), to the point that you no longer list the monthly totals on your front page, like you used to do... -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070828/81b2dff3/attachment.htm From ricardofdiogo at gmail.com Tue Aug 28 17:20:04 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 29 Aug 2007 01:20:04 +0100 Subject: [gutvol-d] People Behind PG In-Reply-To: References: <9c6138c50708211147o17eca5a3g7083d636a833f7ef@mail.gmail.com> <46D31B2E.2050404@netronome.com> <1ac896090708271231t59cdce5fge4ee52c4ddb4d68d@mail.gmail.com> <9c6138c50708271237p3ff0c0b6t6957dc36282ea615@mail.gmail.com> Message-ID: <9c6138c50708281720g2fe20e3dw3fedfb9e5854e7dc@mail.gmail.com> 2007/8/28, Robert Marquardt : > On Mon, 27 Aug 2007 20:37:05 +0100, you wrote: > > >Juliet is now under copyright clearance team. And I've also created a > >special section for Distributed Proofreaders. > > Do not forget to add her to the Wiki team also. > -- > Robert Marquardt (Team JEDI) http://delphi-jedi.org Done. Ricardo From donovan at abs.net Tue Aug 28 18:47:53 2007 From: donovan at abs.net (D Garcia) Date: Tue, 28 Aug 2007 21:47:53 -0400 Subject: [gutvol-d] showdown In-Reply-To: References: Message-ID: <200708282147.54053.donovan@abs.net> On Tuesday 28 August 2007 20:15, Bowerbird at aol.com wrote: > and yet, distributed proofreaders continues to use a > byzantine pseudo-markup, necessitating extra work, > as it requires formatting be separated from proofing. DP markup is considerably more familiar to most people than Markdown, and no more byzantine or pseudo- than it, wiki, or phpbb markup. The separation of formatting and proofreading has a cost in speed, but improves accuracy for both areas. The choice of markup does not increase the work required; it has to be done regardless of where, when or how it is added. > that's why there's been a drop in the number of books > finished monthly by d.p. the past few years (in spite of > the increased number of volunteers that are now being > attracted by a banner on the p.g. 
site), to the point that > you no longer list the monthly totals on your front page, > like you used to do... The current and previous month totals are always in the logo banner on the front page for all visitors. You're referring to the past 12 totals-by-month view, which non-signed-in visitors no longer see. That and other changes were made so that visitors are presented with clear, concise, uncluttered information about the purpose of the site and how they can contribute. From Bowerbird at aol.com Tue Aug 28 19:09:31 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Tue, 28 Aug 2007 22:09:31 EDT Subject: [gutvol-d] New Content Provider Message-ID: lee said: > Bowerbird can be socially inept, > technologically naive, and > unreasonably strident now tell us what you _really_ feel, lee... ;+) -bowerbird From kkloos at dodo.com.au Tue Aug 28 20:25:56 2007 From: kkloos at dodo.com.au (Keith Kloosterman) Date: Wed, 29 Aug 2007 13:25:56 +1000 Subject: [gutvol-d] Problems being a late entrant Message-ID: <46D4E744.9020203@dodo.com.au> My original intention to download a processed book with PG has been altered after receiving feedback. I have now made offerings to DP on 25 Aug without reply. We are all volunteers and Juliet is apparently a very busy person. I still have the problem of reading DP FAQs and contradictory comments made in this forum. After spending weeks making RTF files, they have now been changed to TXT files. Is this not correct for DP? Now I am ready for downloading to DP: scanned pages in png format. Also, those pages have been proofread after OCR/FineReader and are in txt format. Are those all the requirements for DP to arrange more proofreading?
Keith From walter.van.holst at xs4all.nl Wed Aug 29 01:31:31 2007 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Wed, 29 Aug 2007 10:31:31 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <46D4B0E7.8080809@novomail.net> References: <46CE1B11.1070308@dodo.com.au> <46D35A7C.8090605@xs4all.nl> <46D4B0E7.8080809@novomail.net> Message-ID: <46D52EE3.6090306@xs4all.nl> >> As far as Bowerbird's advice is concerned: most valid qualifications of >> him require four-letter words. Or to paraphrase Douglas Adams: "mostly >> useless". > > Bowerbird can be socially inept, technologically naive, and unreasonably > strident, but he is /not/ stupid. I found the advice he offered mostly > in line with my own experience, at least to the extent that it does not > involve working with PG and DP (with which I have little or no > experience). Hm, maybe I went a little overboard, caused by being thoroughly fed up with his rantings about other people's inefficiency and his glorious ZML. Not to mention the occasional flame wars spilling over from Teleread to here. After having fished up some of his advice from my killfilter, I have to admit that it makes unusual sense given the source. Back to lurking it is then (and trying to do more proofreading at DP). Regards, Walter From Bowerbird at aol.com Wed Aug 29 02:16:01 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 29 Aug 2007 05:16:01 EDT Subject: [gutvol-d] showdown Message-ID: donovan said: > DP markup is considerably more familiar to most people than Markdown really? /* hands up out there among the subscribers here: who knows what this markup means? */ /# and once again? how about this? #/ or this? > and no more byzantine or pseudo- than it, wiki, or phpbb markup. all those other forms are byzantine too, i agree. z.m.l., though, is clean. again, i refer you to the z.m.l. versions you can view from this page: > http://z-m-l.com/go/vl3.pl do you see any byzantine markup in any of those files? i think not.
> The separation of formatting and proofreading has a cost in speed, > but improves accuracy for both areas. it might "improve accuracy" compared to doing them both together, using byzantine markup. but compared to a light-markup method, your 5 rounds can't even _begin_ to compare, in speed or accuracy. you put a whole bunch of clunky markup in, and then you take it out. how is the world can you try to convince someone that is _efficient_? even more silly is the fact that you now routinely strip out the styling that the o.c.r. recognized -- so your proofers aren't "distracted" -- and then your formatters _reintroduce_ that styling. unbelievable! (and lest some observers here think that text styling is rare in books, well, sometimes it is. but on the other hand, "the american language" by h.l. mencken, has some 9,293 italicized words in it. not rare at all.) > The choice of markup does not increase the work required; > it has to be done regardless of where, when or how it is added. heavy markup certainly does increase the work required, without doubt. and when you don't have a quick wysiwyg capacity, that's even more so. the easier it is to add markup and see it's correct, the faster the progress. and when that markup is unobtrusive, it will not interfere with proofing, which will mean that you can dispense with separate formatting rounds, which will cut the total number of rounds in half, thus saving much time. the primary reason you had to increase the number of rounds you do is because you were increasingly making .html versions, and that markup became so obtrusive that it seriously interfered with the task of proofing. once dp-canada is up and running, with its light-markup methodology and instant wysiwyg previewing, your own volunteers will tell you this... > The current and previous month totals are always in the logo banner right. and you used to include the last _12_ months, not just the last _2_, current and previous. that was my point. 
thanks for repeating what i said. > You're referring to the past 12 totals-by-month view, > which non-signed in visitors no longer see. you're repeating what i said again. there's really no need to do that. (but of course i understand _why_ you're doing it, because the careless reader -- and there are many here -- will then get the impression that you are _correcting_ what i said. even when it works, it's a cheap trick.) > That and other changes were made so that visitors are > presented with clear, concise, uncluttered information > about the purpose of the site and how they can contribute. when your monthly totals were impressive, and climbing steadily, they were part of the "clear, concise, uncluttered information" that you wanted to share with the world. but since they're permanently stalled at a relatively constant number, now suddenly you "decide" they are _not_ something you want to share. what a coincidence... *** but let's take this whole discussion up a meta-notch, why don't we? i don't really care what system d.p. uses. i'm just going with the flow of the discussions here on this listserve. i recently advised keith that he should drop in on d.p. to see how you do things, but also told him _not_ to adopt d.p. methodology fully -- not if he was intending on doing the digitization by himself -- because d.p. does some things in a way that's suboptimal. so now i'm here filling in some of the details, specifying one of the things about your workflow -- your formatting, using heavy markup -- that i believe can clearly be seen as suboptimal. i am offering a better alternative -- light markup -- that keith can use. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070829/b5b3a57e/attachment-0001.htm From Bowerbird at aol.com Wed Aug 29 02:41:32 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 29 Aug 2007 05:41:32 EDT Subject: [gutvol-d] New Content Provider Message-ID: walter said: > Hm, maybe I went a little overboard, nah, you were just being colorful. i found it amusing. :+) > Hm, maybe I went a little overboard, > caused by being thoroughly fed-up by his rantings "rantings?" i put no emotionality in my messages. they're carefully written, as demonstrated by the formatting. if you experience them as "rantings", it's because you are _injecting_ that into the posts. i _do_ craft my messages as _rorschach_blots_. what you see in them reflects yourself. i do that because i want to learn whatever i can about you. > Hm, maybe I went a little overboard, > caused by being thoroughly fed-up by his rantings > other people's inefficiency and his glorious ZML. > Not to mention the occassional flame wars > spilling over from Teleread to here. again, the discussions of _efficiency_ are because i've done a lot of research to determine a "best-practices" workflow... we've got millions upon millions of books we need to digitize, so it makes sense to figure out the best way to go about it... it's not "ranting". mostly, it's just a bunch of common sense, which i am quite willing to talk about in a rational discussion. and, in support of that rational discourse, i usually have data, which i have collected, that bears directly on the issues at hand. my research has also convinced me of the vital importance of _tools_ that facilitate the workflow. (who can argue with that?) and, so you know, i _started_ talking about the d.p. inefficiency because i _love_ the volunteers there who are doing this big job, and i wanted to _help_ them to do it with as little work as possible. 
it was only when i was treated so badly there that i decided that i didn't care whether they were wasting their time via inefficiency. i still love 'em for the job they're doing, but if they want to bang their head against the wall by using a bad workflow, i don't care. as for "flame-wars" that "spill over" from teleread, i am always very careful to explain exactly why i feel my posts are relevant. and i'm always willing to explain further if you're still wondering. but again, they aren't "flame-wars". after all, this is just e-mail. it's very highly civilized, don't you think? i don't force it on you. you are free to ignore it totally, with a touch of the delete button. and hey, if it gives you an upset stomach or high blood pressure, i highly encourage you to go ahead and push that delete button, without even reading the thing. my messages are clearly marked as coming from "bowerbird at aol.com", so you _are_ forewarned. and even if someone else quotes a bit of my post, it's always very clear that it has come from me, because of my all-lower-casing. but if you _do_ choose to read my posts, and you find them to be "emotional" in any way, then take responsibility for the fact that it is _you_ who injected that aspect into the messages, not me... ok? -bowerbird From schultzk at uni-trier.de Wed Aug 29 03:19:41 2007 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Wed, 29 Aug 2007 12:19:41 +0200 Subject: [gutvol-d] showdown In-Reply-To: References: Message-ID: <6CBC2B8C-0FCC-455E-997B-4CF5C75460A2@uni-trier.de> On 29.08.2007 at 11:16, Bowerbird at aol.com wrote: > donovan said: > > DP markup is considerably more familiar to most people than > Markdown > > really?
> > /* > hands up out there among the subscribers here: > who knows what this markup means? > */ I would guess a comment > > /# > and once again? > how about this? > #/ Here again a comment > > or this? I would guess SMALL CAPITALS How did I score? O.K. not a fair test! I am a programmer, linguist and LaTeX user. So, markup is not unfamiliar to me. Besides, I had done some reading at the DP-site a while back and noticed their mixture of HTML and LaTeX styles. But, as with any markup, you do need to know the syntax and semantics. As to the rest I will leave it uncommented as we have all been there and back! ;-) I am getting an itch to do something, I actually never wanted to bother with: 1) Develop a grammar for the PG formatting standards 2) Write a parser for it 3) Add code for putting markup in the text automatically 4) Add so-called intelligence to perfect the markup With this system all you need is the proofer for the scans. Heh, Bowerbird I am lazy: what is the ratio between the DP-produced texts and texts coming from the NORMAL sources? regards Keith. From ricardofdiogo at gmail.com Wed Aug 29 07:18:17 2007 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed, 29 Aug 2007 15:18:17 +0100 Subject: [gutvol-d] Problems being a late entrant In-Reply-To: <46D4E744.9020203@dodo.com.au> References: <46D4E744.9020203@dodo.com.au> Message-ID: <9c6138c50708290718q96452dfsd72f5e1133bc1a65@mail.gmail.com> 2007/8/29, Keith Kloosterman : > I have now made offerings to DP on 25 > Aug without reply. We are all volunteers and Juliet is apparently a very > busy person. > She is. If she doesn't answer you that's because she _really really_ can't. She loves helping people. > I still have the problem of reading DP FAQs and contradictory comments > made in this forum.
> After spending weeks making RTF files, they have now been changed to TXT > files. Is this not correct for DP? > Forget the comments made in this forum. The only DP FAQ you have to read is the Content Provider's. Here: http://www.pgdp.net/c/faq/cp.php In that FAQ you'll see a section about Guiprep -- forget it for now. Yes. DP needs a *.txt file for every page of the book, along with the corresponding *.png so that people can compare the scan with the OCR output. > Now I am ready for downloading to DP: scanned pages in png format. > Also,those pages have been proofread after OCR/Finereader and are in txt > format. Is that all the requirements for DP to arrange more proofreading? > Keith Yes. In order to upload to DP you will need a zipped folder. E.g.: The Netherlands.zip containing all the pages of the book: 000.txt 000.png 001.txt 001.png 002.txt 002.png If you want, I can take care of uploading your files. Please send me an email if you would like that. Ricardo From Bowerbird at aol.com Wed Aug 29 08:52:25 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Wed, 29 Aug 2007 11:52:25 EDT Subject: [gutvol-d] showdown Message-ID: keith said: > How did I score? 1 for 3. not bad at all. the other two indicate different types of "rewrap" instructions: > http://www.pgdp.net/wiki/PPTools/Guiguts/Rewrapping#Rewrap_Markers > I am getting an itch to do something, > I actually never wanted to bother with: > 1) Develope a grammar for the PG formatting standards > 2) Write a parser for it > 3) Add code for putting markup in the text automatically > 4) Add so-called intelligence to perfect the markup > With this system all you need is the proofer for the scans. sounds like the things that i've said i've already done.
just add to it an error-reporting system that involves folding the e-texts into a wiki-like structure, one that positions the text next to the scans (when possible), letting people _suggest_ any necessary changes, which are then _accepted_or_rejected_ by the administrators, and you'll have yourself a rather-complete workflow... > Heh, Bowerbird I am lazy: what is the ratio between > the DP produced texts and texts comming from > the NORMAL sources? what "normal" sources? there are none, not really. there are a handful of independent producers that feed a steady trickle of books into project gutenberg, while d.p. produces a constant stream of 180/month. addition of things like audio recordings from librivox is what enables p.g. to keep up a still-impressive march. of course, "still-impressive" is a relative thing, what with google scanning more books every day -- before lunch -- than d.p. digitizes in a year. luckily, for the purposes of _reading_ only, a scan-set serves the objective quite well. (even more luckily, google has _apparently_ improved its workflow to the point where there's a 50-50 chance that a book they've just scanned is _completely_ readable -- i.e., every page was scanned, and every scan is legible. still, because it has taken them some time even to reach _this_ "plateau" of quality-control, one is strongly advised to examine the entire scan-set before one starts reading, so you can avoid being surprised by the fact that page 123 is absent-without-leave upon your breathless arrival there.) there are many other scanning projects around these days too. none of them even come close to the quantity that google scans, and -- with the possible exception of o.c.a. -- none of them are significantly better on quality either. so, it's basically _google_. and google is basically a crap-shoot. so unless they can pull a rabbit out of their labs, the quality of the future cyberlibrary will be up to the valiant efforts of individuals like you and me...
-bowerbird From marcello at perathoner.de Wed Aug 29 09:12:00 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed, 29 Aug 2007 18:12:00 +0200 Subject: [gutvol-d] New Content Provider In-Reply-To: <46D4B0E7.8080809@novomail.net> References: <46CE1B11.1070308@dodo.com.au> <46D35A7C.8090605@xs4all.nl> <46D4B0E7.8080809@novomail.net> Message-ID: <46D59AD0.70303@perathoner.de> Lee Passey wrote: > Bowerbird can be socially inept, technologically naive, and unreasonably > strident, but he is /not/ stupid. I found the advice he offered mostly > in line with my own experience, Bowerbird is like one of those scales that tell you your weight and your percentage of body fat. And people think: "Hey, the weight is correct, so the body fat percentage must be correct too." He is intelligent enough to cut and paste FAQ entries, and could be a valuable community member if he could restrict himself to that, but unfortunately he must mix in his own imbecile ideas (and his many personal problems). Novices cannot tell what part is sound advice and what part is bowershit, so the best advice to give novices is to simply ignore him. -- Marcello Perathoner webmaster at gutenberg.org From shabam.dp at gmail.com Wed Aug 29 08:53:46 2007 From: shabam.dp at gmail.com (shabam) Date: Wed, 29 Aug 2007 08:53:46 -0700 Subject: [gutvol-d] Problems being a late entrant In-Reply-To: <9c6138c50708290718q96452dfsd72f5e1133bc1a65@mail.gmail.com> References: <46D4E744.9020203@dodo.com.au> <9c6138c50708290718q96452dfsd72f5e1133bc1a65@mail.gmail.com> Message-ID: <1ac896090708290853m3b1cbfffv4987e7a6f8ecd4b2@mail.gmail.com> Keith, I will backchannel you, and I will help you to get your project onto DP.
I will help anyone else who wants to put a project through DP. If you want to just provide the project and let it go, I can take it and run, or give it to someone else, if I don't have the time. If you want to manage the project yourself, then I can mentor anyone wanting to be a new PM. It does take time though. If you sent a message to Juliet on the 25th, that is only 4 days ago, and she may be out of town, or taking care of her kids. She will get back to you when she has a chance, but it can sometimes take her a week or more. DP is not the only thing she does, and like the rest of us, she is an unpaid volunteer and has to work this in around the rest of her life. Jason On 8/29/07, Ricardo F Diogo wrote: > 2007/8/29, Keith Kloosterman : > > I have now made offerings to DP on 25 > > Aug without reply. We are all volunteers and Juliet is apparently a very > > busy person. > > > She is. If she doesn't answer you that's because she _really really_ > can't. She loves helping people. > > > I still have the problem of reading DP FAQs and contradictory comments > > made in this forum. > > After spending weeks making RTF files, they have now been changed to TXT > > files. Is this not correct for DP? > > > Forget the comments made in this forum. The only DP FAQ you have to > ready is the Content Provider's. Here: > http://www.pgdp.net/c/faq/cp.php > In that FAQ you'll see a section about Guiprep -- forget it for now. > > Yes. DP needs *.txt files for any page of the book, along with the > correspondent *.png so that people can compare the scan with the OCR > output. > > > Now I am ready for downloading to DP: scanned pages in png format. > > Also,those pages have been proofread after OCR/Finereader and are in txt > > format. Is that all the requirements for DP to arrange more proofreading? > > Keith > > Yes. In order to upload to DP you will need a zipped folder. 
Eg: The > Netherlands.zip containing all the pages of the book: > 000.txt > 000.png > 001.txt > 001.png > 002.txt > 002.png > > If you want, I can take care of uploading your files. Please send me > an email if you want so. > > Ricardo From kkloos at dodo.com.au Wed Aug 29 18:30:04 2007 From: kkloos at dodo.com.au (Keith Kloosterman) Date: Thu, 30 Aug 2007 11:30:04 +1000 Subject: [gutvol-d] Problems being a late entrant Message-ID: <46D61D9C.4000305@dodo.com.au> Just to set the record straight, Juliet has replied to my email. Of course I accept, as I said in my email, that we are all volunteers and some of us have many other commitments. I was just getting a bit frustrated and started to wonder whether to persevere with PG or not. As suggested by Juliet, I have contacted Jeroen, who had previously already offered me his help. I will be patient this time. Thanks Bowerbird and shabam for your offers of help. I do not wish to get involved in any personality clashes at PG. Cheers, Keith Kloosterman (not to be confused with the other Keith) From schultzk at uni-trier.de Thu Aug 30 01:59:45 2007 From: schultzk at uni-trier.de (Schultz Keith J.) Date: Thu, 30 Aug 2007 10:59:45 +0200 Subject: [gutvol-d] showdown In-Reply-To: References: Message-ID: Hi Bowerbird and everybody else, 1 out of three!! That is terrible. I am even shocked! On 29.08.2007 at 17:52, Bowerbird at aol.com wrote: > keith said: > > How did I score? > > 1 for 3. not bad at all. > > the other two indicate different types of "rewrap" instructions: > > http://www.pgdp.net/wiki/PPTools/Guiguts/Rewrapping#Rewrap_Markers Excuse me, rewrapping markers? Just read the page. Oh, my gosh!!!! Somebody who knows nothing about text processing and markup wrote these markup tags. They should have given them telling/speaking names such as verbose, poetry, verse, etc.
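For readers who have not met the two rewrap markers being argued over, here is a minimal sketch in Python of what a post-processing tool might do with them. It rests on assumptions that are not stated in this thread: that `/* ... */` means "leave these lines exactly as they are" and that `/# ... #/` means "rewrap as an indented block quote", with each marker on a line of its own. The function name `rewrap` and its parameters are hypothetical, and this is not the Guiguts implementation.

```python
import textwrap


def rewrap(text, width=72, quote_indent=4):
    """Rewrap paragraphs; keep /* */ blocks verbatim, indent /# #/ blocks.

    Assumed marker meanings (not taken from the thread):
      /* ... */  no-wrap block, lines are copied through unchanged
      /# ... #/  block quote, rewrapped with a left indent
    """
    out = []          # finished output lines
    para = []         # lines of the paragraph currently being collected
    mode = "normal"   # "normal", "nowrap" (inside /* */) or "quote" (inside /# #/)

    def flush():
        # Emit the collected paragraph, filled to `width` for the current mode.
        if para:
            indent = " " * quote_indent if mode == "quote" else ""
            out.append(textwrap.fill(" ".join(para), width=width,
                                     initial_indent=indent,
                                     subsequent_indent=indent))
            para.clear()

    for line in text.splitlines():
        marker = line.strip()
        if mode == "nowrap":
            if marker == "*/":
                mode = "normal"
            else:
                out.append(line)   # preserve spacing and line breaks exactly
        elif marker == "/*":
            flush()
            mode = "nowrap"
        elif marker == "/#":
            flush()
            mode = "quote"
        elif marker == "#/" and mode == "quote":
            flush()                # flush while still in quote mode, so it is indented
            mode = "normal"
        elif marker == "":
            flush()
            out.append("")         # blank line separates paragraphs
        else:
            para.append(marker)
    flush()
    return "\n".join(out)


if __name__ == "__main__":
    sample = "one\ntwo\n\n/*\n  a   b\nc\n*/\n\n/#\nquoted\ntext\n#/"
    print(rewrap(sample, width=20))
```

Under those assumptions the markers are instructions to the rewrapping step rather than comments, which is why a programmer's first guess at them goes wrong.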
Yet, I must admit. What has been developed was a huge task and I DO GIVE CREDIT. At least I know now that I will not touch DP or get involved. Simply not designed properly. Oh, well. Not my problem. > > > > I am getting an itch to do something, > > I actually never wanted to bother with: > > 1) Develope a grammar for the PG formatting standards > > 2) Write a parser for it > > 3) Add code for putting markup in the text automatically > > 4) Add so-called intelligence to perfect the markup > > With this system all you need is the proofer for the scans. > > sounds like the things that i've said i've already done. Bowerbird, would you mind giving me a look at your source code? If I get anything off the ground and use some of it I will definately give you credit. We can exchange information directly and not necessarily here. regards Keith. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070830/d6ab974b/attachment.htm From shabam.dp at gmail.com Thu Aug 30 08:14:44 2007 From: shabam.dp at gmail.com (shabam) Date: Thu, 30 Aug 2007 08:14:44 -0700 Subject: [gutvol-d] Problems being a late entrant In-Reply-To: <46D61D9C.4000305@dodo.com.au> References: <46D61D9C.4000305@dodo.com.au> Message-ID: <1ac896090708300814n6fa3d033n58e37ebe27ea4421@mail.gmail.com> Keith, Jeroen is a great person to help you, especially since he knows the language you are doing. In fact, if I was to help you, I would probably ask him to assist, as he knows the language, and I don't. :) Many of us wish to avoid personality clashes. I think that they keep some people from participating in this forum (and PG as a whole). They kept me from participating in the forum for a long time. Then I figured out that I didn't have to read every message. :D Jason On 8/29/07, Keith Kloosterman wrote: > Just to set the record straight, Juliet has replied to my email. 
> Ofcourse I accept, like I entered in my email, that we are all > volunteers and some of us have many other commitments. I was just > getting a bit frustrated and started to wonder to persevere with PG or not. > > As suggested by Juliet, I have contacted Jeroen, who had previously > already offered me his help. I will be patient this time. > > Thanks Bowerbird and shabam for your offers of help. > I do not wish to get involved in any personality clashes at PG > > Cheers, > Keith Kloosterman (not to be confused with the other Keith) > _______________________________________________ > gutvol-d mailing list > gutvol-d at lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From shabam.dp at gmail.com Thu Aug 30 08:56:01 2007 From: shabam.dp at gmail.com (shabam) Date: Thu, 30 Aug 2007 08:56:01 -0700 Subject: [gutvol-d] showdown In-Reply-To: References: Message-ID: <1ac896090708300856p9db36d4t31fbbac9cdc589e2@mail.gmail.com> > Excuse me, rewrapping markers ? > Just read the page. Oh, my gosh!!!! Somebody, who knows nothing about > textprocessing and markup wrote these markup tags. They should have given > them > telling/speaking names as verbose, poetry, verse etc. Yes, a lot of people have complained about them not having good names. I too see them as comment tags, but not everyone has experience in programming. Also, a lot of markup codes use equally un-understandable code. Ever use regex code? Some people have suggesting using verbose tags like
instead of /# #/ and or the html
 instead of /* */

But for those that do not use the buttons in the formatting interface,
these are much harder to type.  Every time we have a discussion about
changing the guidelines, those two come up.  Some people do not want
to change, others are willing to change, but only if something better
comes along, others think we should change it now, and each group is
very vocal about it.  We never come to a consensus about what to
change it to, so it does not get changed.

That is an area where you have it better, doing it yourself.  If you
don't like how something is being done, you are the only one you have
to convince to change it.  For DP, we have a lot of volunteers.  While
Juliet could just change the policy, could you imagine the uproar that
would cause?  We need to get a large group of people to agree on what
code to use before she will agree to change it.

We have done it.  We did away with the 5 star thoughtbreak markup, and
replaced it with <tb>.  This took a lot of discussion, a couple of votes,
and then some more discussion before it happened.  Much easier for the
formatters.  It does however need to be changed to the 5 stars for the
text version, and to an <hr> tag
for the HTML version. I'm not sure if it is easier for Post Processing or not, but it doesn't make it much more difficult. And there was a general consensus about not liking the 5 stars and that we needed to change it to make formatting easier. The only people who said anything about keeping it where those that said "I just hit a button, and it works fine, I don't care if it changes." With the /##/ and /**/ markup, not everyone thinks they are that bad (although most people think that certain aspects of them need to change.) There are also a lot of vocal people that think we should move to a TEI markup, doing away completely with the markup we have now. I do think we are moving this way, but it will not happen over night. Jason From Bowerbird at aol.com Thu Aug 30 09:20:02 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 30 Aug 2007 12:20:02 EDT Subject: [gutvol-d] showdown Message-ID: keith said: > Bowerbird, would you mind giving me a look at your source code? my source isn't available. but as i said, there are plenty of open-source implementations that convert markdown into .html. since those were written by people skilled in the scripting languages in which they're done, they'd probably be more useful to you anyway. as for my perl, i'm in elementary school -- second grade now, soon to be third. :+) -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070830/0008eec0/attachment.htm From Bowerbird at aol.com Thu Aug 30 09:28:57 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 30 Aug 2007 12:28:57 EDT Subject: [gutvol-d] Problems being a late entrant Message-ID: the other keith said: > I do not wish to get involved in any personality clashes at PG me neither. 
:+) unfortunately, some people use conflict as a tactic to embroil with controversy everything i might say, so others will simply stop listening to me. it's another one of those cheap tricks that -- sadly -- works often enough that they continue to employ it. of course, in the process, they shred their own credibility, so i guess there is some justice in the world, after all. a smidgen. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070830/f4e27956/attachment.htm From Bowerbird at aol.com Thu Aug 30 09:43:12 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 30 Aug 2007 12:43:12 EDT Subject: [gutvol-d] showdown Message-ID: as another example for the other keith... the "rewrap markup" discussion is one of those cases where buying in to the d.p. workflow -- even just a little bit -- will often warp your mentality in ways that you might not realize. in my research on the entire range of e-book development, i have found increasingly that rewrapping the text is _not_ something that should be done prematurely in digitization. indeed, for some purposes, it is far more desirable to have the original linebreaks -- even with end-line hyphenation -- retained. these purposes include end-user "final proofing" and mimicking of the original paper-book (for comparison, references that already exist through the archival literature, and for reprinting in a way that can mirror the source-text.) since it's easy enough to have the machine rewrap the lines -- which is essentially how it is achieved even over at d.p. -- there's no reason it cannot be done by end-users, providing a few precautions are taken to ensure the results are accurate. 
so the question is not "what kind of markup should we use to indicate the specific type of rewrap?", but rather the more basic "should we even be _doing_ rewrap?" -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070830/eeefcb72/attachment.htm From Bowerbird at aol.com Thu Aug 30 16:03:46 2007 From: Bowerbird at aol.com (Bowerbird at aol.com) Date: Thu, 30 Aug 2007 19:03:46 EDT Subject: [gutvol-d] the first and most important piece of advice Message-ID: so, i was visiting my spam folder, to see if something i was anticipating had gotten misdirected there, so i decided to see what marcello and lee had to say, just to see if i should release them from the doghouse yet... well, lee's post was entertaining, in a way, so i responded yesterday. (but no, he's still in the doghouse.) marcello, on 8/23, though, couldn't even manage to be entertaining: > The first and most important piece of advice for new people > is not to listen to what bowerbird says. sometimes i feel kinda queasy when i state -- straight out, like i did earlier today -- that "some people use conflict as a tactic to embroil with controversy everything i might say, so others will simply stop listening to me." i mean, really, it sounds so _accusatory_ that it startles me, even though i know it to be the case. if there was importance attached, i might even start to wonder if i was being overly paranoid. so in a way, it's good to have marcello say it himself, out loud. because now everyone knows it's true. his agenda is _clear_... if he wasn't such an ass, i would probably pity poor marcello. imagine being so insecure in yourself that the "first and most important piece of advice" you give "new people" is that they should ignore your critics. what a sad and pathetic existence. 
i say this: listen closely to my critics. examine what they say with a fine-tooth comb and carefully weigh their arguments. (if you can find any, that is.) then make your own decisions... oh yeah, and in regard to marcello in particular, if you find _any_ shreds of logic in his posts, please do share them by translating them into your own words and reposting them. because otherwise i will not see them, because i'm going to continue to route his posts -- unread -- to my spam folder. -bowerbird ************************************** Get a sneak peek of the all-new AOL at http://discover.aol.com/memed/aolcom30tour -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20070830/9b75b98d/attachment.htm From marcello at perathoner.de Thu Aug 30 21:52:23 2007 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri, 31 Aug 2007 06:52:23 +0200 Subject: [gutvol-d] the first and most important piece of advice In-Reply-To: References: Message-ID: <46D79E87.3090102@perathoner.de> Bowerbird at aol.com wrote: > so, i was visiting my spam folder We know that the most interesting place in your pc is the spam folder but we still don't want a weekly report of its contents. -- Marcello Perathoner webmaster at gutenberg.org From schultzk at uni-trier.de Fri Aug 31 01:41:22 2007 From: schultzk at uni-trier.de (Schultz Keith J.) 
Date: Fri, 31 Aug 2007 10:41:22 +0200 Subject: [gutvol-d] showdown In-Reply-To: <1ac896090708300856p9db36d4t31fbbac9cdc589e2@mail.gmail.com> References: <1ac896090708300856p9db36d4t31fbbac9cdc589e2@mail.gmail.com> Message-ID: Hi, Far as the names are concerned you have two paths open: 1) leave them cryptic; its a feature, changing it causes just confusion 2) develop decent names, add them, yet are synonymous and trigger the same code make the others obsolete exchange the tags in older files after some time remove the cryptic These tags are not comments either, at the most pseudo-code or pragma. In my opinion they are simple markup and should be treated so. regards Keith. Am 30.08.2007 um 17:56 schrieb shabam: >> Excuse me, rewrapping markers ? >> Just read the page. Oh, my gosh!!!! Somebody, who knows nothing >> about >> textprocessing and markup wrote these markup tags. They should >> have given >> them >> telling/speaking names as verbose, poetry, verse etc. > > > Yes, a lot of people have complained about them not having good names. > I too see them as comment tags, but not everyone has experience in > programming. Also, a lot of markup codes use equally > un-understandable code. Ever use regex code? Some people have > suggesting using verbose tags like > >
instead of /# #/ > and > or the html
 instead of /* */

[for brevity: snip, snip]

From schultzk at uni-trier.de  Fri Aug 31 02:20:10 2007
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri, 31 Aug 2007 11:20:10 +0200
Subject: [gutvol-d] showdown
In-Reply-To: 
References: 
Message-ID: <9981AF1A-64ED-4962-BDE6-2874AE7997BC@uni-trier.de>

Hi,

	at this point I would like to reiterate for EVERYBODY.

	We are talking about markup, aka formatting.
	When a display/printed version of the text is made, it is
	restructured for output.  Rewrapping is the process of
	changing the line length of an OUTPUT !!!

	Markup does NOT rewrap; it formats! Also, rewrapping
	generally only affects so-called soft line breaks, not the
	hard ones!!
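
A minimal sketch of the distinction drawn above, assuming the common
plain-text convention that a blank line is a hard break (paragraph
boundary) and a single newline inside a paragraph is a soft break. The
function name rewrap and the sample text are made up for illustration;
this is not the code used by DP or by any PG tool.

```python
import textwrap


def rewrap(text, width):
    """Refill each paragraph to `width`; blank lines stay as hard breaks."""
    paragraphs = text.split("\n\n")
    # Soft breaks inside a paragraph are treated as ordinary whitespace.
    filled = [textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs]
    return "\n\n".join(filled)


stored = "The quick brown fox\njumps over the lazy dog.\n\nA second paragraph."

narrow = rewrap(stored, 20)  # output for a narrow display
wide = rewrap(stored, 60)    # output for a wide display
print(narrow)
print("----")
print(wide)
```

The stored text is never modified; only the two outputs differ in line
length, and the paragraph boundary survives in both.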

	It is up to the design of the markup language to encode the
	original look, and to decide how this is intended to be used
	(or ignored) during the production of the final output text.

	I hope this clears the matter up. If not, as Zadeh says:
	"We are still confused, but on a higher level".

	regards
		Keith.

	

	
Am 30.08.2007 um 18:43 schrieb Bowerbird at aol.com:

> as another example for the other keith...
>
> the "rewrap markup" discussion is one of those cases where
> buying in to the d.p. workflow -- even just a little bit -- will
> often warp your mentality in ways that you might not realize.
>
> in my research on the entire range of e-book development,
> i have found increasingly that rewrapping the text is _not_
> something that should be done prematurely in digitization.
>
> indeed, for some purposes, it is far more desirable to have
> the original linebreaks -- even with end-line hyphenation --
> retained.  these purposes include end-user "final proofing"
> and mimicking of the original paper-book (for comparison,
> references that already exist through the archival literature,
> and for reprinting in a way that can mirror the source-text.)
>
> since it's easy enough to have the machine rewrap the lines
> -- which is essentially how it is achieved even over at d.p. --
> there's no reason it cannot be done by end-users, providing
> a few precautions are taken to ensure the results are accurate.
>
> so the question is not "what kind of markup should we use to
> indicate the specific type of rewrap?", but rather the more basic
> "should we even be _doing_ rewrap?"
>
> -bowerbird
>


From schultzk at uni-trier.de  Fri Aug 31 02:36:34 2007
From: schultzk at uni-trier.de (Schultz Keith J.)
Date: Fri, 31 Aug 2007 11:36:34 +0200
Subject: [gutvol-d] the first and most important piece of advice
In-Reply-To: 
References: 
Message-ID: <4424E4DF-110D-4726-BAE8-473B58AF7896@uni-trier.de>

Oh, well,

	The kindergarten is open again.

Am 31.08.2007 um 01:03 schrieb Bowerbird at aol.com:

[snip, snip]

> imagine being so insecure in yourself that the "first and most
> important piece of advice" you give "new people" is that they
> should ignore your critics.  what a sad and pathetic existence.
	I would not say insecure, but rather confident that one is definitely
	not wrong. Aka: "Do not listen to my critics, or you will start
	looking at the deficiencies of my argument". Sad, true. Pathetic, no.
	Do not forget that this kind of behaviour is common among those who
	cannot handle criticism, whether justified or not. Actually, I find
	this kind of behaviour all too common.

> i say this: listen closely to my critics.  examine what they say
> with a fine-tooth comb and carefully weigh their arguments.
> (if you can find any, that is.)  then make your own decisions...
	I could not agree more. Except there are always arguments. It is
	a matter of quality.
>
> oh yeah, and in regard to marcello in particular, if you find
> _any_ shreds of logic in his posts, please do share them by
> translating them into your own words and reposting them.
> because otherwise i will not see them, because i'm going to
> continue to route his posts -- unread -- to my spam folder.
	Saavik to Spock (about Kirk): "He is so illogical"
	Spock (answers): "He is human!"
	
	regards
		Keith.






From Bowerbird at aol.com  Fri Aug 31 09:19:30 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 31 Aug 2007 12:19:30 EDT
Subject: [gutvol-d] showdown
Message-ID: 

keith said:
>    I hope this clears the matter up. If not, as Zadeh says:
>    "We are still confused, but on a higher level".

i don't know who else might be confused, but
i can be clear about what _i_ am talking about.

i'm talking about the plain-text file -- "pgascii".

the text from the o.c.r. is routinely dehyphenated
during proofing, and then rewrapped afterward...

this constitutes a loss of information, namely
the data concerning the original line-breaks.

more accurately, it's a deliberate deletion of data.

(which, coincidentally, is relatively hard to restore,
which makes the ease with which it is discarded
especially ironic and tragic.   i don't know how to
put this any more delicately:   for some purposes,
this deliberate discarding of original linebreaks
makes the digitized text _absolutely_worthless_.
not for _all_ purposes, mind you, but for some.
and if your digitization creates _useless_results_,
then i really think you need to do reconsideration.)
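
A small sketch of why the step described above cannot be undone from the
resulting text alone. The helper dehyphenate_and_join and both sample
layouts are hypothetical, and the joining rule is deliberately naive (it
assumes every end-of-line hyphen marks a split word), so this is an
illustration of the argument, not the procedure d.p. actually runs.

```python
def dehyphenate_and_join(lines):
    """Join the lines of one paragraph, merging words split by an end-line hyphen."""
    out = ""
    for line in lines:
        line = line.strip()
        if out.endswith("-"):
            # Naive assumption: a trailing hyphen always means a split word.
            out = out[:-1] + line
        elif out:
            out += " " + line
        else:
            out = line
    return out


# Two different page layouts of the same words.
page_a = ["the original line-", "breaks are gone"]
page_b = ["the original", "linebreaks are gone"]

text_a = dehyphenate_and_join(page_a)
text_b = dehyphenate_and_join(page_b)
print(text_a)
print(text_a == text_b)
```

Both layouts collapse to the same string, so nothing in that string says
which layout it came from. Restoring the breaks needs the scans, or a
record of the break positions kept alongside the text.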


>    We are talking about markup, aka formatting.
>    When a display/printed version of the text is made,
>    it is restructured for output. Rewrapping is the
>    process of changing the line length of an OUTPUT !!!

distributed proofreaders considers this text to be
its "output", while   i consider it to be my "input", so
neither term seems to me to be usefully descriptive.


>    Markup does NOT rewrap; it formats!

i'm not sure what you mean here.   the markup will often
dictate the specific nature of rewrapping when it's done.


>    Also, rewrapping generally only affects so-called
>    soft line breaks, not the hard ones!!

in a pgascii e-text, there is no distinction between
"soft" and "hard" linebreaks.   you might think, at first,
a single linebreak is a "soft" one, while two (or more)
constitutes a "hard" one.   but that's not true in a table,
or a multi-line salutation or signature block.   and that's
another problem with the ascii e-texts, that there is this
ambiguity about the meaning of each specific linebreak.
it's one of those cases where the human reader usually
can resolve the ambiguity by knowledge of the content,
but that doesn't help our "poor dumb" computers much.
so the human digitizer should be injecting into the data
(i.e., the text) a "clue" that will help the computer decide.

in z.m.l., the "clue" that a line is _not_ to be rewrapped is
to put one or more spaces at the front of the line.   simple.
(this gets interpreted like a .html "br" tag at the end of the
_previous_ line _as_well_as_ the end of the _current_ one.)
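
The leading-space rule can be sketched roughly as follows. This is only
a reading of the one-sentence description above, not the z.m.l. source
(which is not available); the function name rewrap_with_clues, the
default width, and the sample letter are all assumptions made for the
example.

```python
import textwrap


def rewrap_with_clues(text, width=70):
    """Rewrap ordinary lines; keep blank lines and space-led lines verbatim."""
    output = []   # finished output lines
    pending = []  # words collected from consecutive wrappable lines

    def flush():
        # Emit the collected words as filled lines, then start afresh.
        if pending:
            output.extend(textwrap.wrap(" ".join(pending), width=width))
            pending.clear()

    for line in text.split("\n"):
        if line.startswith(" ") or not line.strip():
            # The "clue": break before this line and after it, and leave
            # the line itself exactly as it was typed.
            flush()
            output.append(line)
        else:
            pending.extend(line.split())
    flush()
    return "\n".join(output)


sample = (
    "Dear reader,\n"
    "this line is short\n"
    "and gets joined.\n"
    "  Yours truly,\n"
    "  The Editor\n"
    "More text\n"
    "here."
)
result = rewrap_with_clues(sample)
print(result)
```

In this sketch the two-line signature block survives untouched while the
ordinary lines on either side of it are refilled, which is the case a
bare "one newline is soft, two are hard" rule gets wrong.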


>    It is up to the design of the markup language to encode the
>    original look, and to decide how this is intended to be used
>    (or ignored) during the production of the final output text.

i'm not entirely sure, but i think i agree with that.       :+)

***

keith said:
>   The kindergarten is open again.

nah, not really.   marcello's back in the spam folder again,
where he belongs.   i can't be the only one who noticed that
when i ceased responding to the asses, they largely stopped
posting, so there hasn't been a protracted flaming in a while.

i guess that old bromide about not feeding the trolls is true...

-bowerbird




From rolsch at verizon.net  Fri Aug 31 14:42:53 2007
From: rolsch at verizon.net (Roland Schlenker)
Date: Fri, 31 Aug 2007 17:42:53 -0400
Subject: [gutvol-d] showdown
In-Reply-To: 
References: 
Message-ID: <200708311742.53695.rolsch@verizon.net>

> this constitutes a loss of information, namely
> the data concerning the original line-breaks.
> more accurately, it's a deliberate deletion of data.

I must disagree with you on this point: there is no data loss.

Line-breaks in a paragraph occur because the text is filled to a printed
page.  In most cases they have no relationship to the author's intent or to
the ability of the reader to understand that intent.

As such:

This is a sentence.

and:

This
is
a
sentence.

are both understood by the reader.
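
In code, the claim amounts to comparing the two versions with all
whitespace treated alike. A tiny sketch (the helper name same_words is
made up for the example):

```python
def same_words(a, b):
    """True if both strings contain the same words in the same order."""
    # str.split() with no argument splits on any run of whitespace,
    # so spaces and line breaks are treated identically.
    return a.split() == b.split()


one_line = "This is a sentence."
four_lines = "This\nis\na\nsentence."
print(same_words(one_line, four_lines))
```

Note that this comparison ignores layout by design, so it cannot tell
whether the line breaks in a given text were meaningful or not.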

Roland

From Bowerbird at aol.com  Fri Aug 31 16:16:10 2007
From: Bowerbird at aol.com (Bowerbird at aol.com)
Date: Fri, 31 Aug 2007 19:16:10 EDT
Subject: [gutvol-d] showdown
Message-ID: 

roland-

right now it's 4:20 on friday.

i'll talk more about linebreaks after the three-day weekend...   ;+)

-bowerbird




From piggy at netronome.com  Fri Aug 31 18:36:11 2007
From: piggy at netronome.com (La Monte Henry Piggy Yarroll)
Date: Fri, 31 Aug 2007 21:36:11 -0400
Subject: [gutvol-d] showdown
In-Reply-To: <200708311742.53695.rolsch@verizon.net>
References: 
	<200708311742.53695.rolsch@verizon.net>
Message-ID: <46D8C20B.1070106@netronome.com>

Roland Schlenker wrote:
>> this constitutes a loss of information, namely
>> the data concerning the original line-breaks.
>> more accurately, it's a deliberate deletion of data.
>>     
>
> I must disagree with you on this point: there is no data loss.
>
> Line-breaks in a paragraph occur because the text is filled to a printed
> page.  In most cases they have no relationship to the author's intent or to
> the ability of the reader to understand that intent.
>
> As such:
>
> This is a sentence.
>
> and:
>
> This
> is
> a
> sentence.
>
> are both understood by the reader.
>   
unless your
name
is
e.e. cummings


From rolsch at verizon.net  Fri Aug 31 22:43:18 2007
From: rolsch at verizon.net (Roland Schlenker)
Date: Sat, 01 Sep 2007 01:43:18 -0400
Subject: [gutvol-d] showdown
In-Reply-To: <46D8C20B.1070106@netronome.com>
References: 
	<200708311742.53695.rolsch@verizon.net>
	<46D8C20B.1070106@netronome.com>
Message-ID: <200709010143.18478.rolsch@verizon.net>

On Friday 31 August 2007 9:36 pm, La Monte Henry Piggy Yarroll wrote:
> Roland Schlenker wrote:
> >> this constitutes a loss of information, namely
> >> the data concerning the original line-breaks.
> >> more accurately, it's a deliberate deletion of data.
> >
> > I must disagree with you on this point: there is no data loss.
> >
> > Line-breaks in a paragraph occur because the text is filled to a printed
> > page.  In most cases they have no relationship to the author's intent or
> > to the ability of the reader to understand that intent.
> >
> > As such:
> >
> > This is a sentence.
> >
> > and:
> >
> > This
> > is
> > a
> > sentence.
> >
> > are both understood by the reader.
>
> unless your
> name
> is
> e.e. cummings

I had to look up e. e. cummings because I couldn't figure out what you meant.
He seems to have been quite a character.

I will agree that an e-text of an e. e. cummings work would require line-breaks
to be maintained correctly, since the author's intent requires them to be so.

Roland