From nihil_obstat at mindspring.com Sat Jan 1 06:02:01 2005
From: nihil_obstat at mindspring.com (Dennis McCarthy)
Date: Sat Jan 1 06:02:05 2005
Subject: [gutvol-d] !@!Googleberg eBooks
Message-ID: <9779928.1104588121534.JavaMail.root@wamui10.slb.atl.earthlink.net>

Of particular concern to PG volunteers will be the clarity of the page scans of Google Print's public domain works, which will mainly come from the academic libraries' rare books archives.

As far as I am concerned this is the best potential of Google Print--to make works available that 99.999% of the population never had access to. (How important, really, is it to look at just a few pages of a book that is in most public libraries and many book stores?)

I doubt those libraries will allow Google to cut up the books--and rightly so--therefore the quality of the images may not be as good. Although Google makes it difficult to download these page images, we all know that where there is a will, there is a way. And perhaps some PG volunteers will use these page scans for a real e-book. Better scans would make for easier transcriptions.

From the Google Print side, worse scans would probably cause more errors in their behind-the-scenes OCR database linked to each page scan--making searches of these pages less accurate. Hopefully for researchers, this increase in error rate will be a fraction of a percent, but who knows?

-----Original Message-----
From: juliet.sutherland@verizon.net
Sent: Dec 31, 2004 11:09 PM
To: Project Gutenberg Volunteer Discussion
Subject: Re: Re: [gutvol-d] !@!Googleberg eBooks

> 3. Google cut the pages ('cos the scans are just _beautiful_!) and scan
> the pages of the books into images.

As I've previously noted, destructive scanning of modern reprints is easy and usually results in good images and good OCR.
---------------------------
Dennis McCarthy
nihil_obstat@mindspring.com

From shimmin at uiuc.edu Sat Jan 1 10:07:06 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Sat Jan 1 10:07:19 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <9779928.1104588121534.JavaMail.root@wamui10.slb.atl.earthlink.net>
References: <9779928.1104588121534.JavaMail.root@wamui10.slb.atl.earthlink.net>
Message-ID: <41D6E6CA.5070004@uiuc.edu>

> From the Google Print side, worse scans would probably cause more errors
> in their behind-the-scenes OCR database linked to each page scan--making searches
> of these pages less accurate. Hopefully for researchers, this increase in error
> rate will be a fraction of a percent, but who knows?

Once upon a time, I had access to JSTOR, and frequently browsed through their scans of old (as in 17th-18th century) Philofophical Tranfactions. From the glimpses (mostly from the text surrounding my search terms) I got of the underlying OCR text, I came to the conclusion that even error-ridden OCR is good enough to return keyword search results of non-embarrassing calibre. And I can well imagine that some sort of fuzzy term matching to compensate for the most common known scanno themes could be employed to make raw OCR very suitable for keyword searches.

-- RS

From jtinsley at pobox.com Sat Jan 1 11:06:34 2005
From: jtinsley at pobox.com (Jim Tinsley)
Date: Sat Jan 1 11:06:42 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <20050101040925.FZKW17379.out008.verizon.net@outgoing.verizon.net>
References: <20050101040925.FZKW17379.out008.verizon.net@outgoing.verizon.net>
Message-ID: <20050101190634.GA24497@panix.com>

On Fri, 31 Dec 2004 20:09:25 -0800, wrote:

>Here's an interesting experiment...
>
>Go to http://www.google.com/googleblog/.
>
>Under "All booked up" (which talks about the Google/Library project), click on the link labelled "the survival of the fittest". This takes you to a beta of Google Print, for the specific book "Darwin, and After Darwin".
>
>Under "Search within this book", type "Darwin" and hit "Go". You'll get a new window with 3 images, showing the first few occurrences of "Darwin" in the book, where "Darwin" is highlighted in yellow.
>
>What's interesting is that in the third image, there are two occurrences of the word "Darwin", but the first is not highlighted.
>
>Similarly, if you search for "Berkeley", one occurrence in the second image is missing its highlight.
>
>This suggests that their searches are based on unproofed OCR results (where the unhighlighted occurrences correspond to uncorrected scannos).
>
>... searching for "1 arwin" (one, space, arwin) and having it highlight "Darwin". (Try it, it's neat!)
>---------------
>

Thanks for the cite, Juliet! I didn't know about that thread. I read it, and the main thing that struck me was that bowerbird found the OCRed text, because it sure wasn't in the HTML sent back to me using Mozilla.

Hm. Could they be tailoring their pages depending on User-Agent: or the Accept: line in the headers sent by the browser?

The answer is yes. When I search for "1 arwin" using Lynx, or Mozilla with images turned off (must be turned off before you start your initial Google Search), I get text instead of the images, like:

Darwin, and After Darwin
Pages 1 - 1 of 1 in book for 1 arwin. (0.03 seconds)
Page i
1)ARWIN, AND AFTER darwin A1V ?xfositfoiv OF TIFF DARWINIAN TIFEOR V AND A discussion OF POST-DARWZNL4N QUES7IONS BY THE LATE GEORGE JOHN ROMANES , MA, LL. ...

This is obviously the text they're searching. Unfortunately, the whole text of a page is not similarly displayed when I do a page view.

Interestingly, both "I arwin" and "1 arwin" (capital "I" and digit "1") find the same passage. It seems that somebody in Google Print has decided to tweak its search to be tolerant of at least some common OCR errors.
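
For anyone curious how a search could treat "I arwin" and "1 arwin" alike, here is a minimal sketch of one possible approach: fold characters that OCR commonly confuses into a single stand-in, treat punctuation as spaces, and compare token sequences. The confusion table and the function names (ocr_key, fuzzy_find) are made up for illustration. Nothing here is known to be what Google Print actually does.

```python
import re

# Hypothetical confusion table (an assumption, not Google's): characters
# that OCR engines often mix up, each folded to one canonical stand-in.
CONFUSABLE = str.maketrans({
    "1": "l", "i": "l", "|": "l",
    "0": "o",
    "5": "s",
})


def ocr_key(text):
    """Lowercase, turn punctuation into spaces, fold confusable characters."""
    lowered = text.lower()
    spaced = re.sub(r"[^a-z0-9|]+", " ", lowered)
    return spaced.translate(CONFUSABLE).split()


def fuzzy_find(query, page_text):
    """True if the folded query tokens occur consecutively in the folded page."""
    q, p = ocr_key(query), ocr_key(page_text)
    if not q:
        return False
    return any(p[i:i + len(q)] == q for i in range(len(p) - len(q) + 1))


# The raw OCR line quoted in the message above.
page = "1)ARWIN, AND AFTER darwin A1V ?xfositfoiv OF TIFF DARWINIAN"
print(fuzzy_find("1 arwin", page), fuzzy_find("I arwin", page))
```

Folding is deliberately lossy: it trades some false matches for tolerance of scannos, which seems a fair bargain for keyword search over unproofed text.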
jim

From rjholder at altelco.net Sun Jan 2 16:27:58 2005
From: rjholder at altelco.net (Ronald Holder)
Date: Sun Jan 2 16:27:51 2005
Subject: [gutenberg] Re: [gutvol-d] Re: barriers to XML posting
References: <1e2.2cc70258.2ea9861d@aol.com> <41783BAB.2020901@perathoner.de> <200410220906.i9M96rMS019592@posso.dm.unipi.it> <4178E52E.2030107@adelaide.edu.au>
Message-ID: <000f01c4f12b$13f35b60$2d01a8c0@ATHLON>

----- Original Message -----
From: "Steve Thomas"
To: "Project Gutenberg Volunteer Discussion"
Sent: Friday, October 22, 2004 5:47 AM
Subject: [gutenberg] Re: [gutvol-d] Re: barriers to XML posting

>A question (possibly better put over on the DP list):
>
> Is it possible to OCR a scan directly to XML? Or is the output
> from OCR always going to be text?

Abbyy Finereader 7.0 has the capability of saving each page of OCR as Microsoft Word XML format. I have not experimented with it, and am not even knowledgeable about XML yet, but if at some point PGDP wanted to use XML as a source format, it could be done, if the project manager has this software to work with. Abbyy 7 can also output its OCR as HTML, Excel spreadsheet, and many other formats.

Ronald Holder
PGDP volunteer

From bruce at zuhause.org Sun Jan 2 19:09:36 2005
From: bruce at zuhause.org (Bruce Albrecht)
Date: Sun Jan 2 19:09:52 2005
Subject: [gutenberg] Re: [gutvol-d] Re: barriers to XML posting
In-Reply-To: <000f01c4f12b$13f35b60$2d01a8c0@ATHLON>
References: <1e2.2cc70258.2ea9861d@aol.com> <41783BAB.2020901@perathoner.de> <200410220906.i9M96rMS019592@posso.dm.unipi.it> <4178E52E.2030107@adelaide.edu.au> <000f01c4f12b$13f35b60$2d01a8c0@ATHLON>
Message-ID: <16856.46960.858515.296536@celery.zuhause.org>

It's nice that it can output it in some sort of XML, but I wouldn't want it in Microsoft Word XML format. Microsoft file formats are proprietary and have a tendency to change at every release.

Ronald Holder writes:
> Abbyy Finereader 7.0 has the capability of saving each page of OCR as
> Microsoft Word XML format. I have not experimented with it, and am
> not even knowledgeable about XML yet, but if at some point PGDP wanted to
> use XML as a source format, it could be done, if the project manager
> has this software to work with.
> Abbyy 7 can also output its OCR as HTML, Excel spreadsheet,
> and many other formats.

From shalesller at writeme.com Sun Jan 2 19:46:22 2005
From: shalesller at writeme.com (D. Starner)
Date: Sun Jan 2 19:46:37 2005
Subject: [gutenberg] Re: [gutvol-d] Re: barriers to XML posting
Message-ID: <20050103034622.B4CFA4BDAB@ws1-1.us4.outblaze.com>

"Bruce Albrecht" writes:

> It's nice that it can output it in some sort of XML, but I wouldn't
> want it in Microsoft Word XML format. Microsoft file formats are
> proprietary and have a tendency to change at every release.

We don't have to worry about Microsoft, we have to worry about what the OCR program puts out. Which will most likely be a subset of the stable, well-documented part of Word XML.

In any case, XML is just a storage format. This discussion is much like asking if the OCR program will output zip files; what matters is what's in the files. There's no reason to switch from the current use of RTF unless there's more information available through another format, be it an XML based one or not. We're going to have to switch it into our formats, anyway.

From nihil_obstat at mindspring.com Mon Jan 3 07:52:17 2005
From: nihil_obstat at mindspring.com (Dennis McCarthy)
Date: Mon Jan 3 07:52:23 2005
Subject: [gutvol-d] Google Search/PG Header Suggestion
Message-ID: <22389068.1104767537436.JavaMail.root@wamui01.slb.atl.earthlink.net>

A thought just occurred to me concerning Google searches and PG texts...

It was mentioned in a previous thread that Google has set up a way to prioritize a search, such that "book" plus your term will give added priority to your search with its Google Print database. One searches Google Print through its normal search engine. The Print results (if any) come up first, followed by external internet site results.

The current PG header does not actually have the word "book" in it. The term "ebook" is used a couple times. If the header could be rewritten to include the word "book" somewhere, then PG texts may come up higher in search results, when browsers are searching for on-line texts by title or author. Obviously the PG book would not come up as a Google Print result, but may be more visible in the other results.

---------------------------
Dennis McCarthy
nihil_obstat@mindspring.com

From hart at pglaf.org Mon Jan 3 07:57:42 2005
From: hart at pglaf.org (Michael Hart)
Date: Mon Jan 3 07:57:44 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <9779928.1104588121534.JavaMail.root@wamui10.slb.atl.earthlink.net>
References: <9779928.1104588121534.JavaMail.root@wamui10.slb.atl.earthlink.net>
Message-ID:

On Sat, 1 Jan 2005, Dennis McCarthy wrote:

>
> Of particular concern to PG volunteers will be the clarity of the page scans
> of Google Print's public domain works, which will mainly come from the
> academic libraries' rare books archives.

Yes, one concern we all have is how good Googleberg's scans will be.

Will they give us access to the best hi-res scans? Or only to something that is easy on their storage and bandwidth, and consequently not so good for OCR?

[I'm guessing they will NOT make the best materials available to all. Either in the case of raw scans, or the OCRed full text files.]

> As far as I am concerned this is the best potential of Google Print--to make
> works available that 99.999% of the population never had access to.

Of course, this raises the question if 99.999% of the population WANTS access to these books. . .a question I raised earlier. . .will the Googleberg collection be so stilted that it is mostly for scholars?

> (How important, really, is it to look at just a few pages of a book that is
> in most public libraries and many book stores?)

Well. . .this brings us to the entire point of why have Project Gutenberg?

Why give people an entire home library of eBooks that are "in most public libraries and many book stores?"

*

I say the answer is simply individual access rather than public access.

Of course, Ray Bradbury VIOLENTLY disagrees with me here, and I understand why he does. . .he believes in the social experience of libraries.

> I doubt those libraries will allow Google to cut up the books--and rightly
> so--therefore the quality of the images may not be as good. Although Google
> makes it difficult to download these page images, we all know that where
> there is a will, there is a way. And perhaps some PG volunteers will use
> these page scans for a real e-book. Better scans would make for easier
> transcriptions.

I'm betting Googleberg will store the hi-res scans offline, hidden behind some VERY powerful security.

As for cutting up the books, some are, some aren't.

Michael

From nihil_obstat at mindspring.com Mon Jan 3 08:08:15 2005
From: nihil_obstat at mindspring.com (Dennis McCarthy)
Date: Mon Jan 3 08:08:21 2005
Subject: [gutvol-d] !@!Googleberg eBooks
Message-ID: <23279690.1104768495951.JavaMail.root@wamui01.slb.atl.earthlink.net>

Clarification: My point below was that _a few pages_ of a common work is not particularly useful in the grand scheme of things. (A few pages is all Google Print would offer for most books). An _entire text_ everyone has individual access to is a different matter entirely, and why many of us contribute to PG.

-----Original Message-----
From: Michael Hart
Sent: Jan 3, 2005 10:57 AM
To: Project Gutenberg Volunteer Discussion
Subject: Re: Re: [gutvol-d] !@!Googleberg eBooks

> (How important, really, is it to look at just a few pages of a book that is
> in most public libraries and many book stores?)

Well. . .this brings us to the entire point of why have Project Gutenberg?

Why give people an entire home library of eBooks that are "in most public libraries and many book stores?"

*

I say the answer is simply individual access rather than public access.

Of course, Ray Bradbury VIOLENTLY disagrees with me here, and I understand why he does. . .he believes in the social experience of libraries.

---------------------------
Dennis McCarthy
nihil_obstat@mindspring.com

From hart at pglaf.org Mon Jan 3 08:26:09 2005
From: hart at pglaf.org (Michael Hart)
Date: Mon Jan 3 08:26:09 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <142187819718.20041231122527@noring.name>
References: <20041230212536.30DEE4BDAB@ws1-1.us4.outblaze.com> <142187819718.20041231122527@noring.name>
Message-ID:

On Fri, 31 Dec 2004, Jon Noring wrote:

> Michael Hart wrote:
>
>> In addition, I should add that pretty much ALL the original PG eBooks
>> came from multiple editions, simply to do better error checking.
>
> How many of the PG texts fall into the category "the original PG
> eBooks"?

Who knows?

> There is, of course, a difference between consulting other sources to
> clarify a few things with the text derived from the primary source, and
> simply kludging together a bunch of different editions to form a "new
> edition".

Of course, this gets into a scholarly world I've tried to avoid all these years, as per the suggestions of my father, who was a great Shakespeare professor.

We don't want to get into such scholarly arguments as how to punctuate "To be, or not to be."

Obviously, any scholar will be able to figure out which editions we have used without much effort, and those who are not scholars won't care which editions we used because they don't care if it is:

"To be or not to be."
"To be, or not to be."
"To be; or not to be."
or
"To be: or not to be."

To them, that is not the question, and a discussion of that question would shuffle them off this mortal coil into the land of dreams.

> An example of how things got out of whack with the "original PG texts"
> is Mary Shelley's "Frankenstein", where there are two quite different
> editions, and the version at PG is not even marked as to which edition
> it conforms with.

As for this example, the person who did it first may not have had any idea of the difference in the second. . .that's the purview of the person who does it second. . .they can expound on the differences in the second, and even attach such to the first file.

As for identifying the eBooks with a particular paper edition, I think this should only be done in specific cases where the editions are known to be substantially different for reasons given in the newer editions.

We did this with Darwin, Shakespeare, etc., but I don't see the need to do it in cases in which the differences are all likely to be in typographical errors, margination, pagination, and other publishing items, rather than in the source material.

Michael

From hart at pglaf.org Mon Jan 3 08:52:55 2005
From: hart at pglaf.org (Michael Hart)
Date: Mon Jan 3 08:52:57 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <23279690.1104768495951.JavaMail.root@wamui01.slb.atl.earthlink.net>
References: <23279690.1104768495951.JavaMail.root@wamui01.slb.atl.earthlink.net>
Message-ID:

On Mon, 3 Jan 2005, Dennis McCarthy wrote:

>
> Clarification: My point below was that _a few pages_ of a common work is not
> particularly useful in the grand scheme of things. (A few pages is all
> Google Print would offer for most books).
> An _entire text_ everyone has
> individual access to is a different matter entirely, and why many of us
> contribute to PG.

My apologies, had the accent on the wrong phrase. . . .

Michael

>
> -----Original Message-----
> From: Michael Hart
> Sent: Jan 3, 2005 10:57 AM
> To: Project Gutenberg Volunteer Discussion
> Subject: Re: Re: [gutvol-d] !@!Googleberg eBooks
>
>> (How important, really, is it to look at just a few pages of a book that is
>> in most public libraries and many book stores?)
>
> Well. . .this brings us to the entire point of why have Project Gutenberg?
>
> Why give people an entire home library of eBooks that are "in most public
> libraries and many book stores?"
>
> *
>
> I say the answer is simply individual access rather than public access.
>
> Of course, Ray Bradbury VIOLENTLY disagrees with me here, and I understand
> why he does. . .he believes in the social experience of libraries.
>
> ---------------------------
> Dennis McCarthy
> nihil_obstat@mindspring.com
>

From nwolcott at dsdial.net Mon Jan 3 12:53:55 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Mon Jan 3 12:54:19 2005
Subject: [gutvol-d] Am I still subscribed?
References: <300811F0-5253-11D9-ABD1-000A956D5546@mac.com>
Message-ID: <001901c4f1d6$5fc72ac0$9f9495ce@gw98>

Am I still subscribed ? N. Wolcott nwolcott2@post.harvard.edu

----- Original Message -----
From: David A. Desrosiers
To: Project Gutenberg Volunteer Discussion
Sent: Monday, December 20, 2004 2:35 AM
Subject: Re: [gutvol-d] 'CDDB' for Gutenberg texts

>
> > It occurs to me that it would be useful if there were something for
> > Gutenberg e-texts akin to the CDDB database for MP3s.
>
> You mean like the RDF catalog of all of the Gutenberg texts?
>
> http://gutenberg.net/browse/rdf/catalog.rdf.bz2
>
> I've posted perl here before that splits this apart and
> imports it into SQL in about 8 lines of code. Search the archives.
>
>
> David A. Desrosiers
> desrod@gnu-designs.com
> http://gnu-designs.com
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

From gbnewby at pglaf.org Mon Jan 3 13:50:22 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Mon Jan 3 13:50:24 2005
Subject: [gutvol-d] Am I still subscribed?
In-Reply-To: <001901c4f1d6$5fc72ac0$9f9495ce@gw98>
References: <300811F0-5253-11D9-ABD1-000A956D5546@mac.com> <001901c4f1d6$5fc72ac0$9f9495ce@gw98>
Message-ID: <20050103215022.GA25992@pglaf.org>

On Mon, Jan 03, 2005 at 03:53:55PM -0500, N Wolcott wrote:
> Am I still subscribed ? N. Wolcott nwolcott2@post.harvard.edu

You were able to send a message, so the answer is "yes". You can visit http://lists.pglaf.org to see/change your personal settings if you'd like.

On the other hand, the email in your message text (6 lines above) is not the address you sent from (nwolcott@dsdial.net).

-- Greg

From jonathan_ingram at yahoo.com Mon Jan 3 14:00:29 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Mon Jan 3 14:00:38 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To:
Message-ID: <20050103220029.86577.qmail@web41704.mail.yahoo.com>

--- Michael Hart wrote:
> As for identifying the eBooks with a particular paper edition,
> I think this should only be done in specific cases where the
> editions are known to be substantially different for reasons
> given in the newer editions.
>
> We did this with Darwin, Shakespeare, etc., but I don't see
> the need to do it in cases in which the differences are all
> likely to be in typographical errors, margination, pagination,
> and other publishing items, rather than in the source material.

Most of us at DP disagree with you on this, and happily the whitewashers are now keeping the edition information that we add to the files we produce, instead of removing it. An increasing number of DP-produced texts (and, since DP produces the overwhelming majority of content contributed to PG, an increasing number of PG's recent texts) make note of edition information and page numbers at the very least. Hopefully once we move to the next iteration of our proofreading process we will be able to keep more information -- including markup of words/phrases which are missing or otherwise hard to read in the original.

Several of DP's content providers, myself included, intend over the next few years to find decent editions of works already in PG, but which are not in a state that we would find acceptable if we were proofreading it ourselves (this includes most of the first few thousand texts). Hopefully over time we can update all PG's content to a standard we're happy with.

--
Jon Ingram

From joshua at hutchinson.net Mon Jan 3 14:14:19 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Mon Jan 3 14:14:29 2005
Subject: [gutvol-d] !@!Googleberg eBooks
Message-ID: <20050103221419.821394F4B2@ws6-5.us4.outblaze.com>

----- Original Message -----
From: "Jonathan Ingram"
To: "Project Gutenberg Volunteer Discussion"
Subject: Re: [gutvol-d] !@!Googleberg eBooks
Date: Mon, 3 Jan 2005 14:00:29 -0800 (PST)

>
> Several of DP's content providers, myself included, intend over the next few
> years to find decent editions of works already in PG, but which are not in a
> state that we would find acceptable if we were proofreading it ourselves (this
> includes most of the first few thousand texts). Hopefully over time we can
> update all PG's content to a standard we're happy with.
>
> --
> Jon Ingram
>
>

As one of the other big content providers over at DP, I can say that Jon's words speak for me perfectly. There is NO reason to exclude the information, but PLENTY of reason to keep it. It was pretty much a no brainer for me.

Josh

From gbnewby at pglaf.org Mon Jan 3 14:38:49 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Mon Jan 3 14:38:50 2005
Subject: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <20050103221419.821394F4B2@ws6-5.us4.outblaze.com>
References: <20050103221419.821394F4B2@ws6-5.us4.outblaze.com>
Message-ID: <20050103223849.GC26634@pglaf.org>

On Mon, Jan 03, 2005 at 05:14:19PM -0500, Joshua Hutchinson wrote:
> > Several of DP's content providers, myself included, intend over the next few
> > years to find decent editions of works already in PG, but which are not in a
> > state that we would find acceptable if we were proofreading it ourselves (this
> > includes most of the first few thousand texts). Hopefully over time we can
> > update all PG's content to a standard we're happy with.
> >
> > --
> > Jon Ingram
> >
> >
>
> As one of the other big content providers over at DP, I can say that Jon's words speak for me perfectly. There is NO reason to exclude the information, but PLENTY of reason to keep it. It was pretty much a no brainer for me.
>
> Josh

I'm just writing to confirm that current practice/policy is to leave such information in the eBooks, however they are provided by the producers. This might include publisher name, location and date, as well as other items (e.g., from the verso page).

There are a few things that are still discouraged (such as including the original copyright statement, say from pre-1923, which could confuse readers), but overall there is no prohibition on keeping edition information from the dead trees source(s).

Note that this does NOT mean the PG eBooks from such sources must adhere 100% to those single sources. If we get reasonable corrections from other sources, or unknown sources, we will apply them. As has been often stated here, people who are sticklers for adherence to particular printed sources (errata and all) are welcome to start their own eBook project (and we'll even help, per http://gutenberg.org/about).

-- Greg

From hart at pglaf.org Tue Jan 4 09:20:04 2005
From: hart at pglaf.org (Michael Hart)
Date: Tue Jan 4 09:20:06 2005
Subject: !@!Re: [gutvol-d] !@!Googleberg eBooks
In-Reply-To: <20050103220029.86577.qmail@web41704.mail.yahoo.com>
References: <20050103220029.86577.qmail@web41704.mail.yahoo.com>
Message-ID:

On Mon, 3 Jan 2005, Jonathan Ingram wrote:

>
> --- Michael Hart wrote:
>> As for identifying the eBooks with a particular paper edition,
>> I think this should only be done in specific cases where the
>> editions are known to be substantially different for reasons
>> given in the newer editions.
>>
>> We did this with Darwin, Shakespeare, etc., but I don't see
>> the need to do it in cases in which the differences are all
>> likely to be in typographical errors, margination, pagination,
>> and other publishing items, rather than in the source material.
>
> Most of us at DP disagree with you on this, and happily the whitewashers are
> now keeping the edition information that we add to the files we produce,
> instead of removing it. An increasing number of DP-produced texts (and, since
> DP produces the overwhelming majority of content contributed to PG, an
> increasing number of PG's recent texts) make note of edition information and
> page numbers at the very least. Hopefully once we move to the next iteration
> of our proofreading process we will be able to keep more information --
> including markup of words/phrases which are missing or otherwise hard to read
> in the original.
>
> Several of DP's content providers, myself included, intend over the next few
> years to find decent editions of works already in PG, but which are not in a
> state that we would find acceptable if we were proofreading it ourselves
> (this includes most of the first few thousand texts). Hopefully over time we
> can update all PG's content to a standard we're happy with.
>
> --
> Jon Ingram

There is room for nearly everyone in Project Gutenberg. . . .

DP is more than encouraged to keep working with this philosophy.

This does not stop our encouragement of others who work on eBooks with other philosophies.

We are currently hoping to increase our level of cooperation with Brewster Kahle and the Internet Archive [of which I was once the only surviving member when it was nearly extinct] & with John Mark Ockerbloom's Online Book Pages.

Some of their books we can undoubtedly work on to increase the standards as mentioned above, but quite possibly it would be more polite to do it through their sites first or in some kind of simultaneous release of new eBooks. . .and let them make the decision what relationship the new versions should have to the old.

Michael

From ag737 at freenet.carleton.ca Tue Jan 4 12:05:23 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Tue Jan 4 12:05:32 2005
Subject: [gutvol-d] PG-50/70?
Message-ID:

OK, once again I issue the call: when are we -- the collective we -- going to set up Project Gutenbergs, or similar, to take advantage of (and suffer the disadvantage of) life+ copyright rules?

Especially WRT life+50, time is of the essence...

Anyone? Buehler?

I don't have the technical or financial resources to do this. Someone in a life+50 country must, though. Where are you?

Can we get something, even if only a shell, up and running by month's end?

At least insofar as Canadian life+50 law is concerned, I will GLADLY offer to volunteer to do copyright clearances. I'll discuss my qualifications privately with anyone who may need to know them.

From joshua at hutchinson.net Tue Jan 4 12:16:22 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Jan 4 12:16:30 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <20050104201622.6883C4F570@ws6-5.us4.outblaze.com>

I'm confused. Is there a problem with PG-Australia (life+50) or PG-Europe (life+70)?

PG-Australia: http://gutenberg.net.au/
PG-Europe: http://www.gutenberg.nl/

Josh

----- Original Message -----
From: "Wallace J.McLean"
To: gutvol-d@lists.pglaf.org
Subject: [gutvol-d] PG-50/70?
Date: Tue, 04 Jan 2005 15:05:23 -0500

>
> OK, once again I issue the call: when are we -- the collective we --
> going to set up Project Gutenbergs, or similar, to take advantage of
> (and suffer the disadvantage of) life+ copyright rules?
>
> Especially WRT life+50, time is of the essence...
>
> Anyone? Buehler?
>
> I don't have the technical or financial resources to do this. Someone
> in a life+50 country must, though. Where are you?
>
> Can we get something, even if only a shell, up and running by month's
> end?
>
> At least insofar as Canadian life+50 law is concerned, I will GLADLY
> offer to volunteer to do copyright clearances. I'll discuss my
> qualifications privately with anyone who may need to know them.
>

From Gutenberg9443 at aol.com Tue Jan 4 12:38:47 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Tue Jan 4 12:39:01 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <13e.9e84a12.2f0c58d7@aol.com>

In a message dated 1/4/2005 1:17:05 PM Mountain Standard Time, joshua@hutchinson.net writes:

<

References: <13e.9e84a12.2f0c58d7@aol.com>
Message-ID: <41DB070A.10101@uiuc.edu>

Few countries have gone past life+70. Those that have:

Life+75: Guatemala, Samoa
Life+80: Colombia
Life+99: Ivory Coast
Life+100: Mexico

If there is impending legislation in the US or Europe to exceed life+70, I haven't heard widespread complaint about it.

-- RS

From shalesller at writeme.com Tue Jan 4 13:23:43 2005
From: shalesller at writeme.com (D. Starner)
Date: Tue Jan 4 13:23:52 2005
Subject: [gutvol-d] !@!Googleberg eBooks
Message-ID: <20050104212343.7ABEB4BDAB@ws1-1.us4.outblaze.com>

"Michael Hart" writes:

> Obviously, any scholar will be able to figure out which editions we
> have used without much effort, and those who are not scholars won't
> care which editions we used because they don't care if it is:

There are huge differences between some editions, including major plot changes between the two editions of Frankenstein, and major bowdlerization in editions of other books. Yes, non-scholars care. I personally don't demand typo for typo, but I don't want a portmanteau of texts that the author never saw and no editor ever wrote.

> "To be or not to be."
> "To be, or not to be."
> "To be; or not to be."
> or
> "To be: or not to be."

That's a strawman. In reality, there are large differences in Shakespearean text depending on which original text you take it from, and many people could be interested in which edition it was from.

> To them, that is not the question, and a discussion of that question
> would shuffle them off this mortal coil into the land of dreams.

Behold the power of skimming and outright skipping of boring text.

> As for this example, the person who did it first may not have had
> any idea of the difference in the second

That's all the more reason to always keep edition information.

> We did this with Darwin, Shakespeare, etc.,

Really? Because I don't see it for Shakespeare, except for the first folio editions. (Now I see that the Collins editions have the name in the header but why not put the year? The introduction? The editor name? All things of interest to the non-scholar reader.)

(I think I know the answer here, but can we get rid of the World Library Editions? We have replacement editions, and if we need more, there's numerous editions of Shakespeare we could use. Say the word, and I'll start scanning new editions of Shakespeare for DP to replace these.)

> but I don't see
> the need to do it in cases in which the differences are all
> likely to be in typographical errors, margination, pagination,
> and other publishing items, rather than in the source material.

The major question: how can we tell the difference? I don't insist on pedantry, but I'd rather add the information to many books, so we get the "Leaves of Grass" and the "20000 Lieues sous les mers" right, instead of only adding to the books we know for sure had distinct editions and missing those books.

In the same direction, non-scholarly readers want to know when the book they were reading was published and sometimes where. Is this Civil War novel written after WWI, with a view of the horrors of war? Was it written shortly after the Civil War and published in New York, or Atlanta? Was it written during the Civil War? Then New York or Atlanta would really make a difference.

From gbnewby at pglaf.org Tue Jan 4 13:42:32 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Tue Jan 4 13:42:35 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <13e.9e84a12.2f0c58d7@aol.com>
References: <13e.9e84a12.2f0c58d7@aol.com>
Message-ID: <20050104214232.GA10362@pglaf.org>

The US is 95 or 120 years, per the 1998 copyright extensions. Our copyright rules describe US guidelines:
http://gutenberg.org/howto/copyright-howto

The best resource I know of for worldwide copyright durations is at the OnlineBooks page:
http://onlinebooks.library.upenn.edu/okbooks.html

-- Greg

On Tue, Jan 04, 2005 at 03:38:47PM -0500, Gutenberg9443@aol.com wrote:
> In a message dated 1/4/2005 1:17:05 PM Mountain Standard Time,
> joshua@hutchinson.net writes:
>
> <<I'm confused. Is there a problem with PG-Australia
> <<(life+50) or PG-Europe (life+70)?
>
> Josh, the best I can tell right now, Australia is
> life+50 BUT a law has just been passed, though not to my knowledge ratified
> or else ratified but not yet going into effect, to put Australia in the same
> area as the other countries.
>
> Also to the best of my knowledge, other countries have now moved to life+95,
> and if the US hasn't done so yet it is in the process of doing so.
>
> Even speaking as a writer, guarding my copyrights fiercely because I think
> that if the publisher and the illustrator are still making money from my
> creation I also should make money from it, I think life +95 is absurd. Life +25
> should be quite adequate for anybody. There are many people in this community
> who consider even that to be absurd.
>
> If my reply is incorrect, please somebody correct me. Flaming is not
> necessary. Really.
>
> Anne

From gbnewby at pglaf.org Tue Jan 4 13:47:24 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Tue Jan 4 13:47:25 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <20050104214232.GA10362@pglaf.org>
References: <13e.9e84a12.2f0c58d7@aol.com> <20050104214232.GA10362@pglaf.org>
Message-ID: <20050104214724.GB10362@pglaf.org>

On Tue, Jan 04, 2005 at 01:42:32PM -0800, Greg Newby wrote:
> The US is 95 or 120 years, per the 1998 copyright
> extensions. Our copyright rules describe US guidelines:
> http://gutenberg.org/howto/copyright-howto
>
> The best resource I know of for worldwide copyright
> durations is at the OnlineBooks page:
> http://onlinebooks.library.upenn.edu/okbooks.html

Oops - sorry, I forgot to mention:

Yes, Australia has moved to life+70 (essentially the same as the EU). Effective January 1 2005 or thereabouts. It is *not* retroactive, at least for most works.
Sounds like this basically means that nothing new will enter the public domain in Australia for at least 20 years, but stuff that is already public domain will stay there. Steven Thomas has sent some newspaper article links about it to this mailing list, I think. But I don't know of a link to the actual text of the law for complete details. > -- Greg > > On Tue, Jan 04, 2005 at 03:38:47PM -0500, Gutenberg9443@aol.com wrote: > > In a message dated 1/4/2005 1:17:05 PM Mountain Standard Time, > > joshua@hutchinson.net writes: > > > > < > <<(life+50) or PG-Europe (life+70)? > > > > > > > > Josh, the best I can tell right now, Australia is > > life+50 BUT a law has just been passed, though not to my knowledge ratified > > or else ratified but not yet going into effect, to put Australia in the same > > area as the other countries. > > > > Also to the best of my knowledge, other countries have now moved to life+95, > > and if the US hasn't done so yet it is in the process of doing so. > > > > Even speaking as a writer, guarding my copyrights fiercely because I think > > that if the publisher and the illustrator are still making money from my > > creation I also should make money from it, I think life +95 is absurd. Life +25 > > should be quite adequate for anybody. There are many people in this community > > who consider even that to be absurd. > > > > If my reply is incorrect, please somebody correct me. Flaming is not > > necessary. Really. > > > > Anne > > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From gbnewby at pglaf.org Tue Jan 4 13:48:53 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Tue Jan 4 13:48:55 2005 Subject: [gutvol-d] PG-50/70? 
In-Reply-To: References: Message-ID: <20050104214853.GC10362@pglaf.org> On Tue, Jan 04, 2005 at 03:05:23PM -0500, Wallace J.McLean wrote: > OK, once I again I issue the call: when are we -- the collective we -- > going to set up Project Gutenbergs, or similar, to take advantage of > (and suffer the disadvantage of) life+ copyright rules? > > Especially WRT life+50, time is of the essence... > > Anyone? Buehler? > > I don't have the technical or financial resources to do this. Someone > in a life+50 country must, though. Where are you? > > Can we get something, even if only a shell, up and running by month's > end? > > At least insofar as Canadian life+50 law is concerned, I will GLADLY > offer to volunteer to do copyright clearances. I'll discuss my > qualifications privately with anyone who may need to know them. PGCanada: see http://lists.pglaf.org for their mailing list, archives, and people involved. They've been trying to get started. -- Greg From marcello at perathoner.de Tue Jan 4 11:37:21 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Jan 4 14:25:03 2005 Subject: [gutvol-d] Google Search/PG Header Suggestion In-Reply-To: <22389068.1104767537436.JavaMail.root@wamui01.slb.atl.earthlink.net> References: <22389068.1104767537436.JavaMail.root@wamui01.slb.atl.earthlink.net> Message-ID: <41DAF071.4080900@perathoner.de> Dennis McCarthy wrote: > The current PG header does not actually have the word "book" in it. > The term "ebook" is used a couple times. If the header could be > rewritten to include the word "book" somewhere, then PG texts may > come up higher in search results, when browers are searching for > on-line texts by title or author. Searches by author or title bring up the "bibrec" page because it is far shorter than the whole etext and because it has the search words wrapped in

which gives them more weight. The bibrec pages already have the keyword "book" in the html header. -- Marcello Perathoner webmaster@gutenberg.org From holden.mcgroin at dsl.pipex.com Tue Jan 4 14:26:05 2005 From: holden.mcgroin at dsl.pipex.com (Holden McGroin) Date: Tue Jan 4 14:26:19 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <13e.9e84a12.2f0c58d7@aol.com> References: <13e.9e84a12.2f0c58d7@aol.com> Message-ID: <41DB17FD.5060608@dsl.pipex.com> Gutenberg9443@aol.com wrote: > Even speaking as a writer, guarding my copyrights fiercely because I > think that if the publisher and the illustrator are still making money > from my creation I also should make money from it, I think life +95 is > absurd. Life +25 should be quite adequate for anybody. There are many > people in this community who consider even that to be absurd. I agree wholeheartedly. I seem to remember the copyright office in Australia last year recommending that copyrights be SHORTENED rather than LENGTHENED. The benefits to authors from extending copyrights from Life + 50 to Life + 70 are pretty much non-existent. You could perhaps make the argument that authors may do more work to see that their children or grandchildren are well looked after after their death. However, when you start getting into Life + 70 territory, it's likely that the only beneficiaries will be a generation of descendents who weren't even born when the author was alive!!! The people who really benefit from copyrights are the publishers. Longer copyrights mean longer periods without competition from low-cost public domain publishers. It's perhaps unfortunate that at the governmental level, the people most represented (publishers) are the people who least need to be represented. If governments are given the choice to please high-powered corporations with minimal complaint from the electorate, they'll go right ahead and do it. 
It's even worse when, as in the case of Australia, the government may not even _want_ to impose longer copyrights but they have their hands forced through "Free Trade Agreements" with more powerful countries. When given a choice between sanctions from the richest country in the world and increasing copyrights by _another_ 20 years, which one would you choose? The sad part of it is, governments are increasing copyrights against the advice of their copyright offices, without considering the choices which will have the best impact on their country (which government has seriously taken advice on the "optimal" length of copyright?). These decisions are adversely affecting consumers every day but they're hurried through because nobody cares. Cheers, Holden From stephen.thomas at adelaide.edu.au Tue Jan 4 14:33:09 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Tue Jan 4 14:33:22 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <13e.9e84a12.2f0c58d7@aol.com> References: <13e.9e84a12.2f0c58d7@aol.com> Message-ID: <41DB19A5.305@adelaide.edu.au> I posted something about this on December 12, to the Book People list: "THE final element of the Australia-US free trade agreement (FTA) has passed through federal parliament. [...] The copyright aspects of the FTA [...] come into force on January 1 [...] The term of protection for copyright material was extended by 20 years." So Australia is now Life+70 -- for those authors who died after 1954. It's my understanding -- and I'll keep adding books until someone tells me otherwise -- that the change to our copyright law is NOT retrospective, so any authors who died before 1955 are still available. Including Margaret Mitchell. Steve Gutenberg9443@aol.com wrote: > In a message dated 1/4/2005 1:17:05 PM Mountain Standard Time, > joshua@hutchinson.net writes: > > < > <<(life+50) or PG-Europe (life+70)? 
> > Josh, the best I can tell right now, Australia is life+50 BUT a law > has just been passed, though not to my knowledge ratified or else > ratified but not yet going into effect, to put Australia in the same > area as the other countries. > > Also to the best of my knowledge, other countries have now moved to > life+95, and if the US hasn't done so yet it is in the process of > doing so. > > Even speaking as a writer, guarding my copyrights fiercely because I > think that if the publisher and the illustrator are still making > money from my creation I also should make money from it, I think life > +95 is absurd. Life +25 should be quite adequate for anybody. There > are many people in this community who consider even that to be > absurd. > > If my reply is incorrect, please somebody correct me. Flaming is not > necessary. Really. > > Anne > > > ------------------------- > > _______________________________________________ gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d -- Stephen Thomas, Senior Systems Analyst, Adelaide University Library ADELAIDE UNIVERSITY SA 5005 AUSTRALIA Tel: +61 8 8303 5190 Fax: +61 8 8303 4369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ From Gutenberg9443 at aol.com Tue Jan 4 14:35:44 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Tue Jan 4 14:36:00 2005 Subject: [gutvol-d] !@!Googleberg eBooks, mostly OT Message-ID: In a message dated 1/4/2005 2:24:00 PM Mountain Standard Time, shalesller@writeme.com writes: >There are huge differences between some editions, >including major >plot changes between the two editions of Frankenstein, >and major >bowlerdization in editions of other books. Yes, non->scholars care. >I personally don't demand typo for typo, but I don't want >a portmanteau >of texts that the author never saw and no editor ever >wrote. Are both Frankensteins posted now? I'd love to compare the two. 
Whoever, whenever, gets to do T. H. White's Arthur cycle is going to go nuts--the books exist in at least five extremely different versions. I did a paper on it in grad school. I am devoutly thankful that I'll be dead before anybody has to do that task. The rest of this is partly OT: Recently someone accused me of reading Tom Swift, so out of sheer orneriness I went and did it. I DISTINCTLY remember, when I was twelve and my brother Bill was nine, Bill throwing a fit about all the things that were being blamed on the confederates, and I had to explain to him the difference between the Confederates and the confederates. Almost all of those references are gone now. In fact, the book that upset him the most--the one in which Tom Swift is hunting platinum in Siberia--the word does not appear once. I haven't read them for about 45 years (except for three that I read a few years ago when I was too ill to do anything at all requiring thought and had not yet discovered that The Shadow and Doc Savage were online), so I was amazed at how greedy and unsavvy Tom Swift is. His second-best friend, Mr. Damon, has been kidnapped, Our Hero has promised Mrs. Damon that he will get him back, and then Our Hero placidly goes back to work on his new invention. He almost never reports a crime to the police. And no matter what country he's in, he sees nothing wrong with taking gold, platinum, artifacts, and whatever the heck he wants. I don't know whether the author was that ignorant or just plain didn't care. Maybe that was legal in some countries about WWI era, but I doubt it. It certainly was illegal in Egypt. Joke of the day: Yesterday we were walking home from the grocery store and my husband, seeing a line of birds perched on an electric cable, said, "See, even the birds are online!" Anne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050104/7f2fa876/attachment.html From Gutenberg9443 at aol.com Tue Jan 4 14:47:49 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Tue Jan 4 14:48:02 2005 Subject: [gutvol-d] PG-50/70? Message-ID: In a message dated 1/4/2005 3:26:37 PM Mountain Standard Time, holden.mcgroin@dsl.pipex.com writes: The people who really benefit from copyrights are the publishers. Longer copyrights mean longer periods without competition from low-cost public domain publishers. Again, yea verily. That's one of the reasons why, as soon as a book goes out of print, the writer should ask for copyright reversion. Too many people don't think to do that. Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050104/15794829/attachment.html From shimmin at uiuc.edu Tue Jan 4 16:27:55 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Tue Jan 4 16:28:10 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <41DB17FD.5060608@dsl.pipex.com> References: <13e.9e84a12.2f0c58d7@aol.com> <41DB17FD.5060608@dsl.pipex.com> Message-ID: <41DB348B.4010406@uiuc.edu> > The sad part of it is, governments are increasing copyrights against the > advice of their copyright offices, without considering the choices which > will have the best impact on their country (which government has > seriously taken advice on the "optimal" length of copyright?). These > decisions are adversely affecting consumers every day but they're > hurried through because nobody cares. The sad truth is that when approached from an economic point of view, the optimal length is so short that it seems absurd, and nobody takes you seriously after they hear the result. Here's the analysis: Let's say that we have a work of enduring reputation, so that the right to publish it is worth a perpetuity of some annual income A. 
The copyright then has a present value of A/r, where r is the interest rate. We wish to transfer the copyright from the rights holder to the public when the public has paid the rights holder A/r.

The public pays the rights holder A per year, and if this money accrues interest at the same rate r, then (omitting a pile of algebra) the public will have paid the rights holder A/r after N years, where N = log(2)/log(1+r). Those with some accounting knowledge will recognize that the optimal copyright term at a given interest rate is the same as the doubling time of money at that interest rate:

interest   copyright
rate       term
2%         35.0 yr
3%         23.4 yr
4%         17.7 yr
5%         14.2 yr
6%         11.9 yr
7%         10.2 yr

It's interesting to note that the copyright term of the legislation that started the Anglo-American copyright tradition, the 1710 Statute of Anne, hit this range on the nose with a 14 year term (5% interest). It got the number from the 1624 Statute of Monopolies, which limited royal monopolies to 14 years. In patent law, where the economic disadvantages of too long a patent term are quite clear, most patent offices have kept patent terms in the 14-20 year range, which seems reasonable, looking at the table above. In copyright law, the Continental jurists won the day, and things have gotten out of hand ever since.

-- RS

From davedoty at hotmail.com Tue Jan 4 17:16:22 2005
From: davedoty at hotmail.com (Dave Doty)
Date: Tue Jan 4 17:17:13 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <20050104214232.GA10362@pglaf.org>
Message-ID:

>From: Greg Newby
>The US is 95 or 120 years, per the 1998 copyright extensions.

Or Life+70, as pointed out at the web site you mentioned. I think the original poster probably conflated "Life+70" and "95" into "Life+95".

Dave Doty

From brad at chenla.org Tue Jan 4 19:55:53 2005
From: brad at chenla.org (Brad Collins)
Date: Tue Jan 4 19:59:19 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: (Gutenberg's message of "Tue, 4 Jan 2005 17:47:49 EST") References: Message-ID: Gutenberg9443@aol.com writes: > In a message dated 1/4/2005 3:26:37 PM Mountain Standard Time, > holden.mcgroin@dsl.pipex.com writes: > The people who really benefit from copyrights are the > publishers. Longer copyrights mean longer periods without > competition from low-cost public domain publishers. > > > Again, yea verily. That's one of the reasons why, as > soon as a book goes out of print, the writer should > ask for copyright reversion. Too many people don't > think to do that. > But this is thinking in terms of traditional physical media publishing economics. Wired had an article called the Long Tail (http://www.wired.com/wired/archive/12.10/tail.html) which shows how the back-catalog of a publisher which only sells a small number of copies per year each, cumultively is larger than the top sellers or hits at any one time. This means it's economically viable to keep copyright on everything as long as you have some sort of just-in-time distribution or publishing technology in place so publishers don't get bogged down by keeping large inventories or production costs. This is good news, in that it means that books won't go out of print. But it's bad news because there is no reason for them ever to enter the public domain. So the same technology that let's us give away books also has created an economy of the Long Tail which has enabled publishers to make a profit from unprofitable properties. Amazon certainly understands this, and I think over time, Google will be on the side of retaining eternal copyright (and an iTune-like world) rather than shorter copyright (and a PG-centric world) and a vibrant public domain. I fear a renewed bout of copyright hoarding later this decade and even more consolidation of copyright material into the hands of a handful of big corporations. 
This is the writing on the wall at the moment, but all of this is still turned on it's head by P2P and the underground trend towards FreeNet-like anonymous and very secure networks. And we still need to see how emerging super-economies like India and China will shake things up. Here in Thailand, iTunes is _very_ expensive, almost 40 baht for a single song! Legal VCDs[1] (for hollywood movies) cost only about 120 baht. Two songs at iTunes cost more than a meal at McDonalds! Any not many people can afford (or actually even like) McDonalds. The knife cuts both ways.... b/ Footnotes: [1] VCD's are MPEG1 Video which are very big in Asia. MPEG1 is about the same quality as a VHS tape and a movie fits on two CDs. VCD players are built into practically every music CD player, and most music in Asia is now distributed as Karaoke with video tracks for each song. DVD's are starting to catch on big, but they cost twice as much as VCD's and the players are only now coming down low enough in price for the average person to buy. Once VCDs dropped in price down below 150 baht it gutted the illegal market. I've seen this in Hong Kong as well. It's a bit strange sometimes. Even the illegal stalls on the streets will sell a combination of legal VCDs and a shrinking amount of illegal stuff which now are usually poor quality stuff which is still in the cinema. They are all priced about the same. -- Brad Collins , Bangkok, Thailand From jonathan_ingram at yahoo.com Wed Jan 5 00:52:24 2005 From: jonathan_ingram at yahoo.com (Jonathan Ingram) Date: Wed Jan 5 00:52:44 2005 Subject: Shakespeare in DP (was Re: [gutvol-d] !@!Googleberg eBooks) In-Reply-To: <20050104212343.7ABEB4BDAB@ws1-1.us4.outblaze.com> Message-ID: <20050105085224.56099.qmail@web41711.mail.yahoo.com> --- "D. Starner" wrote: > (I think I know the answer here, but can we get rid of the > World Library Editions? We have replacement editions, and > if we need more, there's numerous editions of Shakespeare > we could use. 
Say the word, and I'll start scanning new editions
> of Shakespeare for DP to replace these.)

Please do! I'll be quite happy to shepherd these through the DP process if you don't have the time. We've already put some Shakespeare through DP -- the 'Bad Quarto' of Hamlet[1] was our 2000th book, for example.

[1] http://www.gutenberg.org/etext/9077

-- Jon Ingram

From traverso at dm.unipi.it Wed Jan 5 03:07:53 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Wed Jan 5 03:05:53 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: (ag737@freenet.carleton.ca)
References:
Message-ID: <200501051107.j05B7rF22738@posso.dm.unipi.it>

>>>>> "Wallace" == Wallace J McLean writes:

Wallace> OK, once I again I issue the call: when are we -- the
Wallace> collective we -- going to set up Project Gutenbergs, or
Wallace> similar, to take advantage of (and suffer the
Wallace> disadvantage of) life+ copyright rules?

Wallace> Especially WRT life+50, time is of the essence...

Wallace> Anyone? Buehler?

Project Gutenberg Europe is starting at http://pge.rastko.net ; it is located in Serbia, hence will work as life+50 soon. Currently it is just a mirror of PG; new material from DP-EU will be uploaded quite soon.

Carlo Traverso

From tb at baechler.net Wed Jan 5 06:41:56 2005
From: tb at baechler.net (Tony Baechler)
Date: Wed Jan 5 06:41:49 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <200501051107.j05B7rF22738@posso.dm.unipi.it>
References:
Message-ID: <5.2.0.9.0.20050105064011.02019a50@snoopy2.trkhosting.com>

Hi.

Are the new posts from PG Europe going to be announced on the "posted" list as is currently done with books from PG of Australia? Also, what about getting a gutenberg.eu or gutenberg.int domain?
At 12:07 PM 1/5/2005 +0100, you wrote:
>Project Gutenberg Europe is starting at http://pge.rastko.net ; it is
>located in serbia, hence will work as life+50 soon: currently it is
>just a mirror of PG, new material from DP-EU will be uploaded quite
>soon.

From jeroen.mailinglist at bohol.ph Wed Jan 5 10:32:21 2005
From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account))
Date: Wed Jan 5 10:31:40 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <41DB348B.4010406@uiuc.edu>
References: <13e.9e84a12.2f0c58d7@aol.com> <41DB17FD.5060608@dsl.pipex.com> <41DB348B.4010406@uiuc.edu>
Message-ID: <41DC32B5.6030105@bohol.ph>

Interesting approach, but we have to translate it into a reality where lobbying has locked up life+50 into a number of international treaties that will take time to change (but MUST in due time, if the public starts to understand how they are being robbed).

In the meantime, I suggest a smart taxation scheme. Most countries have a sales tax or value added tax on the order of 10 to 20 percent. We could now offer two ways of paying that tax: either the normal way, that is, on sales as they are being made, or payment as a much earlier dedication to the public domain, according to some formula similar to that of Robert. Since most works don't have an enduring reputation, most publishers (except those of established classics) will probably go for the short term benefit. To enter the scheme, a registration and a contract with a copyright office will be required.

We could try to make the scheme compulsory for DRM-ed works (together with a deposit of an unencumbered copy in a copyright office), since legislation around DRM is still on the drawing board in many countries.

Joining the scheme will (due to international obligations) be voluntary, but could be made compulsory once these obstacles are taken out of the way, so that life+50 (or 70) will only remain in place for unpublished works, where I have much less trouble with the long term.
I will further work out this idea in a document I am working on, called: "Cultural Heritage and Copyright, a misbalancing act"...

Jeroen Hellingman.

Robert Shimmin wrote:
>
> The sad truth is that when approached from an economic point of view,
> the optimal length is so short that it seems absurd, and nobody takes
> you seriously after they hear the result. Here's the analysis:
>
> Let's say that we have a work of enduring reputation, so that the
> right to publish it is worth a perpetuity of some annual income A. The
> copyright then has a present value of A/r, where r is the interest
> rate. We wish to transfer the copyright from the rights holder to the
> public when the public has paid the rights holder A/r.
>
> The public pays the rights holder A per year, and if this money
> accrues interest at the same rate r, then (omitting a pile of
> algebra) the public will have paid the rights holder A/r after N
> years, where N=log(2)/log(1+r). Those with some accounting knowledge
> will recognize that the optimal copyright term at some interest rate
> is the same time as the doubling time of money at that interest rate:
>
> interest   copyright
> rate       term
> 2%         35.0 yr
> 3%         23.4 yr
> 4%         17.7 yr
> 5%         14.2 yr
> 6%         11.9 yr
> 7%         10.2 yr
>
> It's interesting to note that the copyright term of the legislation
> that started the Anglo-American copyright tradition, the 1710 Statute
> of Anne, hit this range on the nose with a 14 year term (5%
> interest). It got the number from the 1624 Statute of Monopolies,
> which limited royal monopolies to 14 years. In patent law, where the
> economic disadvantages of too long a patent term are quite clear, most
> patent offices have kept patent terms in the 14-20 year range, which
> seems reasonable, looking at the table above. In copyright law, the
> Continental jurists won the day, and things have gotten out of hand
> ever since.
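
The doubling-time formula quoted above, N = log(2)/log(1+r), can be checked numerically. What follows is only a rough sketch, not part of the original post: it assumes annual compounding, so that payments of A per year grow to A*((1+r)^N - 1)/r after N years, which is one plausible reading of the omitted algebra. Setting that equal to A/r gives (1+r)^N = 2. The function names are made up for illustration.

```python
import math


def optimal_term_years(rate):
    """Years N at which (1 + rate)**N == 2, i.e. N = log(2)/log(1 + rate)."""
    return math.log(2) / math.log(1 + rate)


def accumulated_payments(annual_income, rate, years):
    """Value after `years` of paying `annual_income` per year, compounding at `rate`.

    Assumes annual compounding (an ordinary annuity); this is an assumption,
    not something stated in the post.
    """
    return annual_income * ((1 + rate) ** years - 1) / rate


if __name__ == "__main__":
    print("interest   copyright")
    print("rate       term")
    for percent in range(2, 8):
        rate = percent / 100
        print(f"{percent}%         {optimal_term_years(rate):.1f} yr")
```

Under these assumptions the loop reproduces the quoted table, and accumulated_payments(A, r, N) comes out equal to the perpetuity value A/r at the computed N.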
> From ke at gnu.franken.de Wed Jan 5 10:50:31 2005 From: ke at gnu.franken.de (Karl Eichwalder) Date: Wed Jan 5 11:38:51 2005 Subject: [gutvol-d] German texts and the m-dash Message-ID: Thanks for posting Pater Filucius by Wilhelm Busch (http://www.gutenberg.org/1/4/3/4/14340). Unfortunately, the m-dash treatment is wrong--spaces are missing. Is this a common problem for texts prepared by DP-PG? -- http://www.gnu.franken.de/ke/ | ,__o | _-\_<, | (*)/'(*) Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C From sly at victoria.tc.ca Wed Jan 5 11:50:46 2005 From: sly at victoria.tc.ca (Andrew Sly) Date: Wed Jan 5 11:50:55 2005 Subject: [gutvol-d] German texts and the m-dash In-Reply-To: References: Message-ID: On Wed, 5 Jan 2005, Karl Eichwalder wrote: > Thanks for posting Pater Filucius by Wilhelm Busch > (http://www.gutenberg.org/1/4/3/4/14340). Unfortunately, the m-dash > treatment is wrong--spaces are missing. Is this a common problem for > texts prepared by DP-PG? For quite a while now, it has been considered a PG standard to have no spaces around an emdash. I have wondered before if there is a call for treating this differently in various languages. I remember clearly an exchange of email with a new white-washer about spaces around emdashes in a German text I was submitting. I was arguing that many other German texts in PG and other places seemed to have the spaces; he was arguing that the files should be prepared "to standard" before being submitted. Andrew From ag737 at freenet.carleton.ca Wed Jan 5 12:15:13 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Wed Jan 5 12:15:23 2005 Subject: [gutvol-d] PG-50/70? Message-ID: Robert Shimmin wrote: > The sad truth is that when approached from an economic point of view, > the optimal length is so short that it seems absurd, and nobody takes > you seriously after they hear the result. Here's the analysis: > > interest copyright > rate term > 2% 35.0 yr That's very interesting... 
I've done other analyses, based on other grounds.

On the "widows and orphans" theory, that copyright income should be protected not only for the poor starving artists/authors, but also their widows/orphans, life+about 35 years is where the law of diminishing returns kicks in: life+35 years is enough to guarantee the majority of copyrights will actually outlive the majority of the poor orphans that were left behind, assuming western lifespans.

The average age of those "orphans" who are still alive when the copyrights die would be 75 years. (Of course, no orphan would ever be younger than the PLUS in life-PLUS when the copyrights expire.)

Further, 35 years post publication -- let alone post mortem auctoris -- is where the share of books still in print dips to less than five percent, at least in the samples I've tested this on, and at least in Canada. That is to say, of the books first published in 1954, in 1989 only 4% and change were still in recent print, and many of those only in editions for the blind.

From ag737 at freenet.carleton.ca Wed Jan 5 12:16:58 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Wed Jan 5 12:17:06 2005
Subject: [gutvol-d] PG-50/70?
Message-ID:

We need a bulwark in the English-speaking world. We really need to set up a PG-50 in Canada. Soon. Very, very soon.

----- Original Message -----
>From Tony Baechler
Date Wed, 05 Jan 2005 06:41:56 -0800
To traverso@dm.unipi.it, Project Gutenberg Volunteer Discussion
Subject Re: [gutvol-d] PG-50/70?

Hi. Are the new posts from PG Europe going to be announced on the "posted" list as is currently done with books from PG of Australia? Also, what about getting a gutenberg.eu or gutenberg.int domain?

At 12:07 PM 1/5/2005 +0100, you wrote:
>Project Gutenberg Europe is starting at http://pge.rastko.net ; it is
>located in serbia, hence will work as life+50 soon: currently it is
>just a mirror of PG, new material from DP-EU will be uploaded quite
>soon.
From hart at pglaf.org Wed Jan 5 12:34:51 2005 From: hart at pglaf.org (Michael Hart) Date: Wed Jan 5 12:34:53 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: References: Message-ID: On Wed, 5 Jan 2005, Wallace J.McLean wrote: > Robert Shimmin wrote: > >> The sad truth is that when approached from an economic point of view, >> the optimal length is so short that it seems absurd, and nobody takes >> you seriously after they hear the result. Here's the analysis: >> >> interest copyright >> rate term >> 2% 35.0 yr > > That's very interesting... > > I've done other analyses, based on other grounds. > > On the "widows and orphans" theory, that copyright income should be > protected not only for the poor starving artists/authors, but also > their widows/orphans, life+about 35 years is where the law of > diminishing returns kicks in: life+35 years is enough to guarantee the > majority of copyrights will actually outlive the majority of the poor > orphans that were left behind, assuming western lifespans. > > The average age of those "orphans" who are still alive when the > copyrights die, would be 75 years. (Of course, no orphan would ever be > less than the PLUS in life-PLUS when the copyrights expire.) One other point about these particular scenarios: Any items that are still selling after so many years should have left behind a fortune more than large enough to support decades of descendants, simply from royalties already received. We are dealing with only the creme de la creme when we consider items still selling after any substantial length of time. Of course, some people think it is fitting and proper to pass laws the only benefit the creme de la creme, especially if they keep the lower classes from having access to anything. Let's face it, every time copyrights are extended another 20 years, that's a million books that we don't see going to the public domain. 
> Further, 35 years post publication -- let alone post mortem autoris -- > is where the number of books still in print dips to less than five > percent, at least in the samples I've tested this on, are still even in > print, at least in Canada. That is to say, of the books first published > in 1954, in 1989 only 4% and change were still in recent print, and > many of those only in editions for the blind. Yes, this figures in pretty well with other estimates I have heard; 4% would be the largest possible number to consider, and, depending on how many were special editions not readily available to the public, probably much less, and again we must consider that those were the best sellers, not likely to be living off government checks. mh From hart at pglaf.org Wed Jan 5 13:30:42 2005 From: hart at pglaf.org (Michael Hart) Date: Wed Jan 5 13:30:42 2005 Subject: !@!Re: Shakespeare in DP (was Re: [gutvol-d] !@!Googleberg eBooks) In-Reply-To: <20050105085224.56099.qmail@web41711.mail.yahoo.com> References: <20050105085224.56099.qmail@web41711.mail.yahoo.com> Message-ID: On Wed, 5 Jan 2005, Jonathan Ingram wrote: > > --- "D. Starner" wrote: >> (I think I know the answer here, but can we get rid of the >> World Library Editions? We have replacement editions, and >> if we need more, there's numerous editions of Shakespeare >> we could use. Say the word, and I'll start scanning new editions >> of Shakespeare for DP to replace these.) You won't be able to legally replace the World Library Shakespeare, because many of the source editions are still copyrighted, as Prof. George Lyman Kittredge didn't finish his complete Shakespeare until the 1930's. However, the "life +50" PGs could redo him, as he died in 1941, but the "life +70" and US copyrights are still in force. 
Kittredge was the best edition of its day, perhaps until the Riverside edition, which is the new standard, and should be kept as the best public domain resource, accoring the advice of most Shakespeare professors I have consulted. It's one thing to add a new version, quite something else to "get rid of" an old one. Not to mention that the World Library was VERY nice to Project Gutenberg in donating this material 10 years ago, when we only were coming up on 100 eBooks. Michael From ke at gnu.franken.de Wed Jan 5 12:24:48 2005 From: ke at gnu.franken.de (Karl Eichwalder) Date: Wed Jan 5 13:38:08 2005 Subject: [gutvol-d] Re: German texts and the m-dash In-Reply-To: (Andrew Sly's message of "Wed, 5 Jan 2005 11:50:46 -0800 (PST)") References: Message-ID: Andrew Sly writes: > I remember clearly an exchange of email with a new white-washer > about spaces around emdashes in a German text I was submitting. > I was arguing that many other German texts in PG and other > places seemed to have the spaces; he was arguing that the files > should be prepared "to standard" before being submitted. Something along these line I read, too. But I thought the post-processors or white-washer would use a special switch to prepare German texts more like traditional German texts ;) For HTML I prefer "xyz — zyx" instead of "xyz--zyx". -- http://www.gnu.franken.de/ke/ | ,__o | _-\_<, | (*)/'(*) Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C From ke at gnu.franken.de Wed Jan 5 13:46:41 2005 From: ke at gnu.franken.de (Karl Eichwalder) Date: Wed Jan 5 13:46:57 2005 Subject: [gutvol-d] Re: !@!Re: Shakespeare in DP In-Reply-To: (Michael Hart's message of "Wed, 5 Jan 2005 13:30:42 -0800 (PST)") References: <20050105085224.56099.qmail@web41711.mail.yahoo.com> Message-ID: Michael Hart writes: > You won't be able to legally replace the World Library Shakespeare, > because many of the source editions are still copyrighted, as Prof. 
> George Lyman Kittredge didn't finish his complete Shakespeare until > the 1930's. > > However, the "life +50" PGs could redo him, as he died in 1941, > but the "life +70" and US copyrights are still in force. Editions are not "protected" that long in Germany--ten or some 20/25 years, the latter for scientific editions. Of course, editor's comments and footnotes, introductions and the like are protected life+70. -- http://www.gnu.franken.de/ke/ | ,__o | _-\_<, | (*)/'(*) Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C From shalesller at writeme.com Wed Jan 5 13:55:00 2005 From: shalesller at writeme.com (D. Starner) Date: Wed Jan 5 13:55:12 2005 Subject: [gutvol-d] Re: [gweekly] Pt2 Project Gutenberg Weekly Newsletter Message-ID: <20050105215500.E16E7164002@ws1-4.us4.outblaze.com> > And from David Widger, about #14568, Sir Gawayne and the Green Knight: > > NOTE: The Old English "yogh" characters have been translated both upper > and lower-case yoghs to digit 3's. There are Unicode allocations for > these (in HTML Ȝ and ȝ) but at present no font which implements > these. Substiting the digit 3 seemed a workable compromise which anybody > can read. The linked html "Old English 'yogh' file" uses Ȝ and > ȝ representations, and is included for users with specialist fonts. Yes, there are fonts that implement these. Even a quick search should have revealed that. Try or or or or , not to mention Code2000 which has pretty much every character in Unicode. 
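For anyone who wants to check the characters under discussion, here is a minimal sketch in Python. It assumes the two yoghs in the quoted note are U+021C and U+021D, and that the plain-text edition swaps each one for the digit 3 as the note describes. The helper names are made up for illustration and are not part of any PG tool.

```python
import unicodedata

YOGH_UPPER = "\u021c"  # LATIN CAPITAL LETTER YOGH
YOGH_LOWER = "\u021d"  # LATIN SMALL LETTER YOGH


def html_reference(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


def yogh_to_digit(text: str) -> str:
    """Replace both yoghs with the digit 3 (hypothetical helper that
    mirrors the compromise described in the quoted note)."""
    return text.replace(YOGH_UPPER, "3").replace(YOGH_LOWER, "3")


if __name__ == "__main__":
    for ch in (YOGH_UPPER, YOGH_LOWER):
        # Show the code point, the official Unicode name, and the HTML form.
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), html_reference(ch))
    print(yogh_to_digit("kny\u021dt"))
```

Going the other way (digit 3 back to yogh) is not safe to automate, since a genuine digit 3 in the text would be converted as well.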
From hart at pglaf.org Wed Jan 5 14:00:11 2005 From: hart at pglaf.org (Michael Hart) Date: Wed Jan 5 14:00:12 2005 Subject: [gutvol-d] Re: !@!Re: Shakespeare in DP In-Reply-To: References: <20050105085224.56099.qmail@web41711.mail.yahoo.com> Message-ID: On Wed, 5 Jan 2005, Karl Eichwalder wrote: > Michael Hart writes: > >> You won't be able to legally replace the World Library Shakespeare, >> because many of the source editions are still copyrighted, as Prof. >> George Lyman Kittredge didn't finish his complete Shakespeare until >> the 1930's. >> >> However, the "life +50" PGs could redo him, as he died in 1941, >> but the "life +70" and US copyrights are still in force. > > Editions are not "protected" that long in Germany--ten or some 20/25 > years, the latter for scientific editions. Of course, editor's comments > and footnotes, introductions and the like are protected life+70. Is this true even in cases such as Shakespeare? Where a great deal of effort goes into choosing which lines from which publications? mh From JBuck814366460 at aol.com Wed Jan 5 14:14:23 2005 From: JBuck814366460 at aol.com (Jared Buck) Date: Wed Jan 5 14:14:36 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure References: Message-ID: <005001c4f373$eaf1a4b0$9bc9c7ac@jared> Hi all, It's Jared :) I'm a college student down here in California, and I've been with PG for some time as a DP and more recently as one of the newsletter editors. Prof. Hart and I have been talking about (by email) the possibility of reviving the old directory structure (ie etext 03, etext94, etc) as perhaps a classic PG website or at least an alternative site for users to browse etexts the way they did before, with the etexts organized by year.
It's the structure I am most used to, and in discussions with Hart and Greg Newby, I believe a project of this scale can be undertaken (to create a separate site housing the etexts under the old directory structure).

Prof. Hart also tells me there has been some discussion on this issue on the mailing list recently, but I only just joined the gutvol list a few days ago. Perhaps someone wouldn't mind filling me in on what I missed?

Jared Buck
----------------------
Project Gutenberg editor http://www.gutenberg.net

From jeroen.mailinglist at bohol.ph Wed Jan 5 14:20:14 2005
From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account))
Date: Wed Jan 5 14:19:17 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <41DB348B.4010406@uiuc.edu>
References: <13e.9e84a12.2f0c58d7@aol.com> <41DB17FD.5060608@dsl.pipex.com> <41DB348B.4010406@uiuc.edu>
Message-ID: <41DC681E.8060607@bohol.ph>

I've been studying this idea a bit further, and think there is a fallacy in this reasoning: you claim to give the author his present value at publication, but you are not giving it at the present; you are spreading it out over N years. Under your assumptions, he can consume A forever from day zero with perpetual copyright, but he will have to wait for N years (and save everything, except interest) before he can consume A forever after this deal. Since most people don't live forever, this isn't a very good deal.

I think you can still make a good economic case for a copyright lasting in the order of 28 years, but that will require some more math, and statistics about sales patterns of common copyrighted works. That's why I wanted to throw in the VAT, or even some arbitrary valuation of elements borrowed from the public domain, so that we can claim a work for the public when the author has earned roughly 85% of its value according to very conservative estimates.

Jeroen Hellingman.
Robert Shimmin wrote:

> The sad truth is that when approached from an economic point of view,
> the optimal length is so short that it seems absurd, and nobody takes
> you seriously after they hear the result. Here's the analysis:
>
> Let's say that we have a work of enduring reputation, so that the
> right to publish it is worth a perpetuity of some annual income A. The
> copyright then has a present value of A/r, where r is the interest
> rate. We wish to transfer the copyright from the rights holder to the
> public when the public has paid the rights holder A/r.
>
> The public pays the rights holder A per year, and if this money
> accrues interest at the same rate r, then (omitting a pile of
> algebra) the public will have paid the rights holder A/r after N
> years, where N=log(2)/log(1+r). Those with some accounting knowledge
> will recognize that the optimal copyright term at some interest rate
> is the same as the doubling time of money at that interest rate:
>
>   interest   copyright
>   rate       term
>   2%         35.0 yr
>   3%         23.4 yr
>   4%         17.7 yr
>   5%         14.2 yr
>   6%         11.9 yr
>   7%         10.2 yr
>
> It's interesting to note that the copyright term of the legislation
> that started the Anglo-American copyright tradition, the 1710 Statute
> of Anne, hit this range on the nose with a 14 year term (5%
> interest). It got the number from the 1624 Statute of Monopolies,
> which limited royal monopolies to 14 years. In patent law, where the
> economic disadvantages of too long a patent term are quite clear, most
> patent offices have kept patent terms in the 14-20 year range, which
> seems reasonable, looking at the table above. In copyright law, the
> Continental jurists won the day, and things have gotten out of hand
> ever since.
>

From shimmin at uiuc.edu Wed Jan 5 15:00:03 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Wed Jan 5 15:00:17 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <41DC681E.8060607@bohol.ph> References: <13e.9e84a12.2f0c58d7@aol.com> <41DB17FD.5060608@dsl.pipex.com> <41DB348B.4010406@uiuc.edu> <41DC681E.8060607@bohol.ph> Message-ID: <41DC7173.9030702@uiuc.edu> Jeroen Hellingman (Mailing List Account) wrote: > > I've been studying this idea a bit further, and think there is a fallacy > in this reasoning, that is, you are giving the author his present value > at publication, but you are not giving it at the present, but spread out > of N years.... Under your assumptions, he can consume A forever from > day zero with perpeptual copyright, but he will have to wait for N years > (and save everything, except interest) before he can consume A forever > after this deal. Since most people don't life forever, this isn't a very > good deal. > > I think you can still make a good economical case for a copyright > lasting in the order of 28 years, but that will require some more math, > and statistics about sales patterns of common copyrighted works. That's > why I wanted to throw in the VAT, or even some arbitrary valuation of > elements borrowed from the public domain, so that we can claim a work > for the public when the author has earned roughly 85 % of its value > according to very conservative estimates. The assumption is this style of scheme is that the creator should retain the right to a work as long as there remains significant value in the work. No matter where you draw the line, though, there exist some works of enduring value, which happen to be those works which are most cared about. While you would say, the collective value of free use of old ephemera is worth the value lost to the creators of enduring works, it could just as easily be said that the value lost to the creators of enduring works cannot possibly be worth the use of those works that have lost all value. The same assumption you start with can be used to justify perpetual copyright! 
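For readers who want the algebra that the original analysis in this thread skipped, here is one way to get there, assuming the payments of A arrive at the end of each year and compound at rate r. Their accumulated value after N years is A*((1+r)^N - 1)/r. Setting that equal to the copyright's value A/r gives (1+r)^N = 2, so N = log(2)/log(1+r). The short Python sketch below reproduces the table from that formula. The function name is invented here, and the end-of-year payment timing is an assumption rather than something stated in the thread.

```python
import math


def copyright_term_years(rate: float) -> float:
    """Years N at which yearly payments of A, compounding at `rate`,
    have grown to A/rate. This is the doubling time of money."""
    return math.log(2) / math.log(1 + rate)


def accumulated_payments(rate: float, years: float) -> float:
    """Value after `years` of paying 1 unit at the end of each year,
    compounded at `rate` (the standard annuity future-value formula)."""
    return ((1 + rate) ** years - 1) / rate


if __name__ == "__main__":
    print("interest  copyright")
    print("rate      term")
    for percent in range(2, 8):
        years = copyright_term_years(percent / 100)
        print(f"{percent}%        {years:.1f} yr")
```

The second helper is only there as a cross-check: at N years the accumulated payments should equal 1/r, the present value of a perpetuity paying 1 per year.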
If we instead say that the public should invest in a work the value of that work, but no more than the value of that work, then you reach my result. After N years, the value of the public's investment in the work, plus interest on that investment, has reached the value of that work. Causing the public to invest still more in the work is inefficient, because the public cannot both invest in the old work and new work. -- RS From jeroen.mailinglist at bohol.ph Wed Jan 5 15:45:11 2005 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Wed Jan 5 15:45:12 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <41DC7173.9030702@uiuc.edu> References: <13e.9e84a12.2f0c58d7@aol.com> <41DB17FD.5060608@dsl.pipex.com> <41DB348B.4010406@uiuc.edu> <41DC681E.8060607@bohol.ph> <41DC7173.9030702@uiuc.edu> Message-ID: <41DC7C07.5000609@bohol.ph> My assumption is not "the author should get the value of the work," as that opens up the doors to perpetual copyright, but rather something like "the author should get a sufficient opportunity to try to earn back his initial investment." In your scheme, you claim to give him the present value (at publication date), but actually give it somewhat later, but then you can't maintain you're giving him present value. 
In my scheme, I would try to find a term that would not distract investors from investing in the production of a work, taking into consideration a number of factors, such as return on investment within a reasonable time (no investor will invest in a project with a break-even point more than 70 years in the future -- that is if he is thinking about the money), the distribution of earnings derived from a work, influence of competition from a vibrant public domain (if it can be shown that that really inhibits the creation of new works, not just stops the creation of "More of the same"); the debt every work has from borrowings from the public domain, and even personality rights (better known under a misnomer 'moral rights'), and even then I think publication + 28 years is very much more reasonable than the life + 70 we have to struggle with today. Once the public has paid the author what was necessary to enable him to create the work, I think the public should pay no more, as that will only disencourage the author to write yet another work (Why should he work if he is still getting money for nothing?), and disable us to distribute those same resources to other authors who also want to be able to write books. I am perfectly happy to give considerable margins, so that authors can actually earn a lot if their works have a big value to the public. Jeroen Hellingman. From j.hagerson at comcast.net Wed Jan 5 18:18:33 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Wed Jan 5 18:18:53 2005 Subject: [gutvol-d] Shakespeare in PG Message-ID: <002101c4f396$08e006c0$6401a8c0@sarek> I have copies of the Yale Shakespeare that were copyrighted in 1922. These are the individual blue-covered volumes for each play. Should I submit these for clearance and scanning? 
John Hagerson

From pupeno at pupeno.com Wed Jan 5 23:08:22 2005
From: pupeno at pupeno.com (Pupeno)
Date: Wed Jan 5 23:03:29 2005
Subject: [gutvol-d] DocBook
Message-ID: <200501060408.25704.pupeno@pupeno.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,
I'm new to the Gutenberg project and I think it's awesome: so much information and literature, on-line, for free.

The only thing I don't like is the format. Plain text doesn't seem very good for storing books, and some books are in HTML, which is even worse. I personally prefer DocBook[1] for storing books. For those of you who don't know what DocBook is: it is an XML-based format designed exclusively for storing books, which stores no information about how the book will look on a graphical screen, only information about what things are. It has tags for separating chapters, appendices, prefaces, parts, books (in sets of books), etc.

This is not just a complaint. I'd like to know if you'd be interested in adopting DocBook, at least as one of the possible formats (since DocBook files can easily be turned into HTML, PDF and plain text [2], I would recommend DocBook as the main format); if so, I'm willing to donate some of my time to the cause, researching and developing a way to offer DocBook files (especially, how the current books can be turned into DocBook). If you'd like to know more about this, I already have some ideas, but I didn't want to write a kilometer-long email; just tell me and I'll elaborate.

Thank you.
- --
Pupeno: pupeno@pupeno.com - http://www.pupeno.com
Reading Science Fiction: http://sfreaders.com.ar

[1] http://www.docbook.org
[2] Take a look at Science Fiction Reader's library: http://sfreaders.com.ar/library , all the files except the DocBooks are automatically created.
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQFB3OPpfW48a9PWGkURAmajAJ4wlsRAIR+bN5hnv8jZiB/pS1pbgwCfYElo v0dlFn9z+fjGXenXOWffI90= =4sRX -----END PGP SIGNATURE----- From joshua at hutchinson.net Thu Jan 6 05:20:07 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 6 05:20:09 2005 Subject: [gutvol-d] Re: German texts and the m-dash Message-ID: <20050106132007.4F3ECEDB50@ws6-1.us4.outblaze.com> In my experience, HTML files DO currently switch -- to — However, the text files use -- because the — entity equivalent doesn't exist in 7bit ASCII. I think I've seen this discussion before on DP forums. If I remember correctly, it was decided to stick to the xyz--xyz standard simply to avoid confusion and complication. Josh ----- Original Message ----- From: "Karl Eichwalder" To: gutvol-d@lists.pglaf.org Subject: [gutvol-d] Re: German texts and the m-dash Date: Wed, 05 Jan 2005 21:24:48 +0100 > > Andrew Sly writes: > > > I remember clearly an exchange of email with a new white-washer > > about spaces around emdashes in a German text I was submitting. > > I was arguing that many other German texts in PG and other > > places seemed to have the spaces; he was arguing that the files > > should be prepared "to standard" before being submitted. > > Something along these line I read, too. But I thought the > post-processors or white-washer would use a special switch to prepare > German texts more like traditional German texts ;) For HTML I prefer > "xyz ? zyx" instead of "xyz--zyx". > > -- > http://www.gnu.franken.de/ke/ | ,__o > | _-\_<, > | (*)/'(*) > Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Thu Jan 6 05:22:48 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 6 05:22:50 2005 Subject: [gutvol-d] PG-50/70? 
Message-ID: <20050106132248.F388AEDF10@ws6-1.us4.outblaze.com> Note: Not trying to be a smart@ss. Why do we need the bulwark specifically in an English-speaking country? Josh ----- Original Message ----- From: "Wallace J.McLean" > > We need a bulwark in the English-speaking world. We really need to set > up a PG-50 in Canada. Soon. Very, very soon. > > > > ----- Original Message ----- > > From Tony Baechler > Date Wed, 05 Jan 2005 06:41:56 -0800 > To traverso@dm.unipi.it, Project Gutenberg Volunteer Discussion > > Subject Re: [gutvol-d] PG-50/70? > > > Hi. Are the new posts from PG Europe going to be announced on > the "posted" > list as is currently done with books from PG of Australia? Also, what > about getting a gutenberg.eu or gutenberg.int domain? > > At 12:07 PM 1/5/2005 +0100, you wrote: > > Project Gutenberg Europe is starting at http://pge.rastko.net ; it is > > located in serbia, hence will work as life+50 soon: currently it is > > just a mirror of PG, new material from DP-EU will be uploaded quite > > soon. > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Thu Jan 6 05:28:26 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 6 05:28:28 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure Message-ID: <20050106132826.C54B39E794@ws6-2.us4.outblaze.com> No disrespect, but ... That seems like a collosal waste of time. I could *perhaps* see a benefit to a directory structure that organized the texts by original paper publication date, but the date we got around to it? I really don't see a benefit there. Especially since as the years went on, we got "ahead" of the game and had things being posted in an entirely different time frame than what we were reporting them as being posted. 
Josh ----- Original Message ----- From: "Jared Buck" To: "Project Gutenberg Volunteer Discussion" Subject: [gutvol-d] Project Gutenberg Original Directory Structure Date: Wed, 5 Jan 2005 14:14:23 -0800 > > Hi all, It's Jared :) I'm a college student down here in California, and I've > been with PG for some time as a DP and more recently as one of the newsletter > editors. Prof. Hart and I have been talking about (by email) the possibility > of reviving the old directory structure (ie etext 03, etext94, etc) as perhaps > a classic PG website or at least an alternative site for users to browse > etexts the way they did before, with the etexts organized by year. It's the > structure I am most used to, and in discussions with Hart and Greg Newby, I > believe a project of this scale can be undertaken (to create a separate site > housing the etexts under the old directory structure. > > Prof. Hart also tells me there has been some discussion on this issue on the > mailing list recently, but I only just joined the gutvol list a few days ago. > Perhaps someone wouldn't mind filling me in on what I missed? > > Jared Buck > ---------------------- > Project Gutenberg editor http://www.gutenberg,net > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Thu Jan 6 05:34:44 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 6 05:34:46 2005 Subject: [gutvol-d] DocBook Message-ID: <20050106133444.295F3EDF11@ws6-1.us4.outblaze.com> There is an ongoing effort to move to a master XML format, but instead of DocBook, we've tentatively chosen TEI (or rather, more specifically, a subset of TEI). If you'd like more information about it, please see this website: http://www.gutenberg.org/tei/ Also, if you have specific questions, I'd be happy to answer them to the best of my ability. 
Josh ----- Original Message ----- From: Pupeno To: gutvol-d@lists.pglaf.org Subject: [gutvol-d] DocBook Date: Thu, 6 Jan 2005 04:08:22 -0300 > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, > I'm new to the Gutenberg project and I think it's awesome, so much > information, literature, on-line, for free. > The only thing I don't like is the format, plain text doesn't seem very good > for storing books. Then some books are in HTML, that's even worst. I > personally prefeer DocBook[1] for storing books. For those of you that > doesn't know what DocBook is: it is an xml-based format designed exclusively > for storing books where no information about how the book will be seen in a > graphical screen is stored but information about what things are. It has tags > for separating chapters, appendices, prefaces, parts, books (in sets of > books), etc. > This is not just a complain. I'd like to know if you'd be intrested in > adopting DocBook, at least as one of the possible formats (being that > DocBooks can be turned easily into HTML, PDF and Plain text [2], I would > recomend DocBook as the main format); if so, I'm willing to donate some of my > time to the cause in researching and developing a way to offer DocBooks > (specially, how the current books can be turned into DocBook). If you'd like > to know more about this, I already have some ideas but I didn't want to write > a kilometer long email, just tell me and I'll elaborate more. > Thank you. > - -- > Pupeno: pupeno@pupeno.com - http://www.pupeno.com > Reading Science Fiction ? http://sfreaders.com.ar > > [1] http://www.docbook.org > [2] Take a look to Science Fiction Reader's library: > http://sfreaders.com.ar/library , all the files except the DocBooks are > automatically created. 
> -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.6 (GNU/Linux) > > iD8DBQFB3OPpfW48a9PWGkURAmajAJ4wlsRAIR+bN5hnv8jZiB/pS1pbgwCfYElo > v0dlFn9z+fjGXenXOWffI90= > =4sRX > -----END PGP SIGNATURE----- > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From hart at pglaf.org Thu Jan 6 06:33:24 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 6 06:33:26 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure In-Reply-To: <20050106132826.C54B39E794@ws6-2.us4.outblaze.com> References: <20050106132826.C54B39E794@ws6-2.us4.outblaze.com> Message-ID: On Thu, 6 Jan 2005, Joshua Hutchinson wrote: > No disrespect, but ... > > That seems like a collosal waste of time. It's not your time, or much of our time, it is the time of voluneers who have asked to do this project, and there have been multiple requests. I'm sure many of the various projects within Project Gutenberg might seem like a waste to time to those not interested in them, or who have other projects going in other directions they would like to see more focus on. > I could *perhaps* see a benefit to a directory structure that organized the > texts by original paper publication date, but the date we got around to it? It is generally better to have more options than fewer, go for it! > I really don't see a benefit there. Especially since as the years went on, > we got "ahead" of the game and had things being posted in an entirely > different time frame than what we were reporting them as being posted. Just another option, for those who want it, no need for anyone who doesn't want it to be concerned. 
> > Josh Michael > > ----- Original Message ----- > From: "Jared Buck" > To: "Project Gutenberg Volunteer Discussion" > Subject: [gutvol-d] Project Gutenberg Original Directory Structure > Date: Wed, 5 Jan 2005 14:14:23 -0800 > >> >> Hi all, It's Jared :) I'm a college student down here in California, and I've >> been with PG for some time as a DP and more recently as one of the newsletter >> editors. Prof. Hart and I have been talking about (by email) the possibility >> of reviving the old directory structure (ie etext 03, etext94, etc) as perhaps >> a classic PG website or at least an alternative site for users to browse >> etexts the way they did before, with the etexts organized by year. It's the >> structure I am most used to, and in discussions with Hart and Greg Newby, I >> believe a project of this scale can be undertaken (to create a separate site >> housing the etexts under the old directory structure. >> >> Prof. Hart also tells me there has been some discussion on this issue on the >> mailing list recently, but I only just joined the gutvol list a few days ago. >> Perhaps someone wouldn't mind filling me in on what I missed? >> >> Jared Buck >> ---------------------- >> Project Gutenberg editor http://www.gutenberg,net >> >> _______________________________________________ >> gutvol-d mailing list >> gutvol-d@lists.pglaf.org >> http://lists.pglaf.org/listinfo.cgi/gutvol-d > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From hart at pglaf.org Thu Jan 6 06:39:54 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 6 06:39:56 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <20050106132248.F388AEDF10@ws6-1.us4.outblaze.com> References: <20050106132248.F388AEDF10@ws6-1.us4.outblaze.com> Message-ID: On Thu, 6 Jan 2005, Joshua Hutchinson wrote: > Note: Not trying to be a smart@ss. 
>
> Why do we need the bulwark specifically in an English-speaking country?
>
> Josh

1. Because every country and every culture deserves to build its own Project Gutenberg, sometimes more than just one Project Gutenberg.

2. Because Canada has NOT made its copyright laws more restrictive over and over and over.

3. Specifically because last week Australia DID make its copyright laws more restrictive, and is considering doing it even more.

4. Even more specifically to host Gone With The Wind, after the lawyers' onslaught about it recently.

5. Project Gutenberg of Canada should also encourage more French eBooks, not just English ones.

6. The more Project Gutenbergs the better. . . .

From joshua at hutchinson.net Thu Jan 6 06:40:53 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Jan 6 06:40:55 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
Message-ID: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com>

----- Original Message -----
From: "Michael Hart"
>
> On Thu, 6 Jan 2005, Joshua Hutchinson wrote:
>
> > No disrespect, but ...
> >
> > That seems like a collosal waste of time.
>
> It's not your time, or much of our time, it is the time of voluneers who
> have asked to do this project, and there have been multiple requests.
>

I agree. However, Jared was basically asking for comment on the idea, which I provided. Personally, I see it as effort that would be better spent elsewhere. I am the last person to dictate how a volunteer should spend his/her time, though (since I'm sure plenty of people see TEI, which is my pet project, as a waste of time as well).

I apologize, Jared, if I sounded like I was trying to stop you from working on this. I was simply stating my opinion of the project.
Josh

From hart at pglaf.org Thu Jan 6 07:07:35 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 6 07:07:37 2005
Subject: [gutvol-d] DocBook
In-Reply-To: <200501060408.25704.pupeno@pupeno.com>
References: <200501060408.25704.pupeno@pupeno.com>
Message-ID:

People should feel free to repost Project Gutenberg eBooks in any and all formats they wish.

This was actually written into the PG header for ages, all the way back to EBCDIC v ASCII, etc.

Michael

On Thu, 6 Jan 2005, Pupeno wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello,
> I'm new to the Gutenberg project and I think it's awesome: so much
> information, literature, on-line, for free.
> The only thing I don't like is the format; plain text doesn't seem very good
> for storing books. Then some books are in HTML, and that's even worse. I
> personally prefer DocBook[1] for storing books. For those of you who
> don't know what DocBook is: it is an XML-based format designed exclusively
> for storing books, in which no information about how the book will look on a
> graphical screen is stored, only information about what things are. It has tags
> for separating chapters, appendices, prefaces, parts, books (in sets of
> books), etc.
> This is not just a complaint. I'd like to know if you'd be interested in
> adopting DocBook, at least as one of the possible formats (since
> DocBook can easily be turned into HTML, PDF and plain text [2], I would
> recommend DocBook as the main format); if so, I'm willing to donate some of my
> time to the cause in researching and developing a way to offer DocBooks
> (especially, how the current books can be turned into DocBook). If you'd like
> to know more about this, I already have some ideas, but I didn't want to write
> a kilometer-long email; just tell me and I'll elaborate.
> Thank you.
> - --
> Pupeno: pupeno@pupeno.com - http://www.pupeno.com
> Reading Science Fiction ?
http://sfreaders.com.ar
>
> [1] http://www.docbook.org
> [2] Take a look at Science Fiction Reader's library:
> http://sfreaders.com.ar/library , all the files except the DocBooks are
> automatically created.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.6 (GNU/Linux)
>
> iD8DBQFB3OPpfW48a9PWGkURAmajAJ4wlsRAIR+bN5hnv8jZiB/pS1pbgwCfYElo
> v0dlFn9z+fjGXenXOWffI90=
> =4sRX
> -----END PGP SIGNATURE-----
>

From hart at pglaf.org Thu Jan 6 07:09:25 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 6 07:09:26 2005
Subject: [gutvol-d] Shakespeare in PG
In-Reply-To: <002101c4f396$08e006c0$6401a8c0@sarek>
References: <002101c4f396$08e006c0$6401a8c0@sarek>
Message-ID:

On Wed, 5 Jan 2005, John Hagerson wrote:

> I have copies of the Yale Shakespeare that were copyrighted in 1922. These
> are the individual blue-covered volumes for each play. Should I submit these
> for clearance and scanning?

That would be great!

Thanks!!!

Michael

From hacker at gnu-designs.com Thu Jan 6 07:09:23 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jan 6 07:09:48 2005
Subject: [gutvol-d] DocBook
In-Reply-To:
References: <200501060408.25704.pupeno@pupeno.com>
Message-ID:

> People should feel free to repost Project Gutenberg eBooks in any
> and all formats they wish.

Does that include formats which require specialized readers, such as Microsoft Word, or mobile formats, such as Plucker?

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From hart at pglaf.org Thu Jan 6 07:24:16 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 6 07:24:17 2005
Subject: [gutvol-d] DocBook
In-Reply-To:
References: <200501060408.25704.pupeno@pupeno.com>
Message-ID:

On Thu, 6 Jan 2005, David A. Desrosiers wrote:

>
>> People should feel free to repost Project Gutenberg eBooks in any
>> and all formats they wish.
>
> Does that include formats which require specialized readers,
> such as Microsoft Word, or mobile formats, such as Plucker?

Any format. . .there are PG eBooks out there in .lit, Plucker, and many other formats. . .no problem.

Michael

From joshua at hutchinson.net Thu Jan 6 07:30:58 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Jan 6 07:31:01 2005
Subject: [gutvol-d] DocBook
Message-ID: <20050106153058.5B9D44F517@ws6-5.us4.outblaze.com>

You can repost them in ROT-13 for all we care. ;)

However, what *we* will post is much more limited. Specialized readers are typically not supported on PG's website.

Josh

----- Original Message -----
From: "David A. Desrosiers"
To: "Project Gutenberg Volunteer Discussion"
Subject: Re: [gutvol-d] DocBook
Date: Thu, 6 Jan 2005 10:09:23 -0500 (EST)

>
> > People should feel free to repost Project Gutenberg eBooks in any
> > and all formats they wish.
>
> Does that include formats which require specialized readers,
> such as Microsoft Word, or mobile formats, such as Plucker?
>
> David A. Desrosiers
> desrod@gnu-designs.com
> http://gnu-designs.com

From jon at noring.name Thu Jan 6 07:37:18 2005
From: jon at noring.name (Jon Noring)
Date: Thu Jan 6 07:37:27 2005
Subject: [gutvol-d] DocBook
In-Reply-To:
References: <200501060408.25704.pupeno@pupeno.com>
Message-ID: <4391005156.20050106083718@noring.name>

David Desrosiers wrote:
> someone else wrote (please keep attributions -- people are important):

>> People should feel free to repost Project Gutenberg eBooks in any
>> and all formats they wish.

> Does that include formats which require specialized readers,
> such as Microsoft Word, or mobile formats, such as Plucker?

Maybe it is time for PG to encourage (or require) that anything submitted should be in some kind of open standards format, not a proprietary format?
Open standards formats include (off the top of my head):

  plain text (e.g., ASCII, UTF-8, UTF-16)
  marked up text using an open standard Schema/DTD (XML, SGML)
  TeX/LaTeX (I'm assuming they are open standards of some sort?)
  PDF/A (when ISO finalizes that spec, maybe later this year)
  Open Office
  Plucker
  etc.

Formats which do not reach the threshold of "openness" include:

  Microsoft Word (and RTF which I think is controlled by MS)
  PDF (other than the upcoming PDF/A)
  LIT
  Mobipocket
  eReader/Palm/etc.

Of course, following Michael's philosophy, the most preferred and durable formats are those which are based on text, which include plain (regularized) text, and XML documents.

Even if PG decides not to require this (which I think they should, to make a bold statement, and which does not really affect people's ability to submit works), PG should strongly encourage what is submitted to be based on open standards.

Jon Noring

From ke at gnu.franken.de Thu Jan 6 07:25:40 2005
From: ke at gnu.franken.de (Karl Eichwalder)
Date: Thu Jan 6 07:38:55 2005
Subject: [gutvol-d] Re: German texts and the m-dash
In-Reply-To: <20050106132007.4F3ECEDB50@ws6-1.us4.outblaze.com> (Joshua Hutchinson's message of "Thu, 06 Jan 2005 08:20:07 -0500")
References: <20050106132007.4F3ECEDB50@ws6-1.us4.outblaze.com>
Message-ID:

"Joshua Hutchinson" writes:

> In my experience, HTML files DO currently switch -- to —

Under those circumstances something went wrong with
http://www.gutenberg.org/dirs/1/4/3/4/14340/14340-h/14340-h.htm .

> However, the text files use -- because the — entity equivalent
> doesn't exist in 7bit ASCII.

That's okay.

> I think I've seen this discussion before on DP forums. If I remember
> correctly, it was decided to stick to the xyz--xyz standard simply to
> avoid confusion and complication.
I'm not sure whether the German reading community will get used to it ;-)

-- 
http://www.gnu.franken.de/ke/                            |      ,__o
                                                         |    _-\_<,
                                                         |   (*)/'(*)
Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C

From joshua at hutchinson.net Thu Jan 6 07:43:29 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Jan 6 07:43:34 2005
Subject: [gutvol-d] Re: German texts and the m-dash
Message-ID: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com>

Inka (I'm assuming that's who did the HTML version) must not have used any of the GutCutter-style tools BilFlis over at DP created! ;)

-- to — is not a requirement, as far as I know, but it is something that I see most people do.

I see no problem with replacing -- with in the HTML, though. It would be a very simple find/replace to do so, too.

Josh

----- Original Message -----
From: "Karl Eichwalder"
To: gutvol-d@lists.pglaf.org
Subject: [gutvol-d] Re: German texts and the m-dash
Date: Thu, 06 Jan 2005 16:25:40 +0100

>
> "Joshua Hutchinson" writes:
>
> > In my experience, HTML files DO currently switch -- to —
>
> Under those circumstances something went wrong with
> http://www.gutenberg.org/dirs/1/4/3/4/14340/14340-h/14340-h.htm .
>
> > However, the text files use -- because the — entity equivalent
> > doesn't exist in 7bit ASCII.
>
> That's okay.
>
> > I think I've seen this discussion before on DP forums. If I remember
> > correctly, it was decided to stick to the xyz--xyz standard simply to
> > avoid confusion and complication.
>
> I'm not sure whether the German reading community will get used to it ;-)
>
> --
> http://www.gnu.franken.de/ke/                            |      ,__o
>                                                          |    _-\_<,
>                                                          |   (*)/'(*)
> Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C

From traverso at dm.unipi.it Thu Jan 6 08:14:01 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Thu Jan 6 08:11:30 2005
Subject: [gutvol-d] Re: German texts and the m-dash
In-Reply-To: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com> (joshua@hutchinson.net)
References: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com>
Message-ID: <200501061614.j06GE1e16389@posso.dm.unipi.it>

> > In my experience, HTML files DO currently switch -- to —
>
> Under those circumstances something went wrong with
> http://www.gutenberg.org/dirs/1/4/3/4/14340/14340-h/14340-h.htm .
>
> > However, the text files use -- because the — entity equivalent
> > doesn't exist in 7bit ASCII.
>
> That's okay.
>
> > I think I've seen this discussion before on DP forums. If I remember
> > correctly, it was decided to stick to the xyz--xyz standard simply to
> > avoid confusion and complication.
>
> I'm not sure whether the German reading community will get used to it ;-)
>

Just to avoid complications ;-)

a) \227 for em-dash is neither ASCII nor ISO Latin nor Unicode: it is a Windows code page value. Em-dash is —

b) Unicode has two slightly different characters, em-dash and horizontal bar, ―; the latter is explicitly indicated for dialogue, which is where it is mostly used in German, French, Italian, and other languages.

It would make a lot of sense to use horizontal bar (which has some space around it) and em-dash (which has none) where they are indicated, leaving aside en-dash –, figure-dash ‒, etc. But, as I said, this is a further level of complication (i.e. typographical precision) that is probably beyond our (present) reach.
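
A minimal Python sketch of the code points mentioned above, offered as an illustration rather than as anything proposed in the thread. It assumes only the Python standard library; the helper names describe and ascii_dashes_to_em are hypothetical. It shows that octal \227 (byte 0x97) is an em dash only under Windows code page 1252, and it lists the Unicode names and decimal character references of the four dashes in question.

```python
import html
import unicodedata

# Octal \227 is the single byte 0x97 (decimal 151). It is not valid ASCII and
# only maps to an em dash when read as Windows code page 1252.
RAW = b"\227"
EM_DASH = RAW.decode("cp1252")

# The dashes named in the message: figure dash, en dash, em dash, horizontal bar.
DASHES = ["\u2012", "\u2013", "\u2014", "\u2015"]


def describe(ch: str) -> str:
    """Return code point, Unicode name, and decimal character reference."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} &#{ord(ch)};"


def ascii_dashes_to_em(text: str) -> str:
    """Hypothetical helper: swap the plain-text '--' convention for an em dash."""
    return text.replace("--", "\u2014")


if __name__ == "__main__":
    for ch in DASHES:
        print(describe(ch))
    # The named HTML entity resolves to the same character as cp1252 byte 0x97.
    print(html.unescape("&mdash;") == EM_DASH)
    print(ascii_dashes_to_em("xyz--xyz"))
```

The decimal references printed this way are what an HTML file could carry instead of the raw Windows byte. Whether "--" should become an em dash or a horizontal bar for a given language is left open here, as it is in the message.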
Carlo Traverso

From Gutenberg9443 at aol.com Thu Jan 6 08:20:36 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Thu Jan 6 08:20:45 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <15.3ba34d31.2f0ebf54@aol.com>

In a message dated 1/5/2005 4:45:20 PM Mountain Standard Time, jeroen.mailinglist@bohol.ph writes:

> Once the public has paid the author what was necessary to enable him to create the work

This is not a flame. It's a careful exchange of views with whoever I'm quoting.

Do I understand correctly that the fellow who works in the steel mill should be paid only what is necessary for the steel to be produced? And the waitress in a restaurant should be paid only enough for the orders to be taken and the food put on the table. Well, let's see, how much is the wear and tear on her shoe leather and her apron and the clothing she wears at the restaurant?

For cryin' out loud, YOU CANNOT HAVE LABOR OF ANY KIND, WHITE COLLAR OR BLUE COLLAR, WITHOUT PAYING THE LABORER. Karl Marx and Jesus Christ agree on this, though they disagree as to how it should be done.

It is 5:40 PM on a snowy day and we just called the plumber. Why should we have to pay him $35 just for coming here, before he even looks at the problem? Does it cost that much in gasoline? Surely it isn't the cost of the truck, because it's old enough that it's already paid for.

I have been working for SIX YEARS on one book. At the moment it's about 200,000 words long. I have thrown away closer to two MILLION words that I wound up tossing and rewriting. Let's see, what is the cost of the paper . . . and the ink . . . and the computer . . . and the printer . . . Does that sum up the cost of writing the book? And I shouldn't get any more than that?

WHY? I would make this 100-point type except that it would wind up perfectly ordinary type on other people's machines.

Why should the publisher be paid . . . and the bookseller be paid . . . and the ink supplier be paid . . . and the paper supplier be paid . . .
and the shipping companies be paid . . . and so on and so forth, for as long as people want to read a book, but the person who wrote the book should fall out of the loop and stop getting paid?

Life plus 100, or life plus 70, or life plus 60, is absurd. Life plus 25 is not absurd. That's all I'm asking for. But too many people think I should get royalties for 10 years or 15 years and then no more, even when everybody else is still making money from the book.

From kth at srv.net Thu Jan 6 07:44:45 2005
From: kth at srv.net (Kevin Handy)
Date: Thu Jan 6 08:22:51 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com>
Message-ID: <41DD5CED.8060605@srv.net>

Joshua Hutchinson wrote:

>----- Original Message -----
>From: "Michael Hart"
>
>>On Thu, 6 Jan 2005, Joshua Hutchinson wrote:
>>
>>>No disrespect, but ...
>>>
>>>That seems like a colossal waste of time.
>>>
>>It's not your time, or much of our time, it is the time of volunteers who
>>have asked to do this project, and there have been multiple requests.
>>
>
>I agree. However, Jared was basically asking for comment on the idea, which I provided. Personally, I see it as effort that would be better spent elsewhere. I am the last person to dictate how a volunteer should spend his/her time, though (since I'm sure plenty of people see TEI as a waste of time as well, which is my pet project).
>
>I apologize, Jared, if I sounded like I was trying to stop you from working on this. I was simply stating my opinion of the project.
>
>

Wouldn't it be easier to just create a web page that listed the original names in the original directory structure, and then linked to the current book, wherever it is? It wouldn't require as much space as a full copy of all the books, and would probably be easier to keep in sync with any updated files.

From hacker at gnu-designs.com Thu Jan 6 08:27:13 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jan 6 08:27:56 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DD5CED.8060605@srv.net>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net>
Message-ID:

> Wouldn't it be easier to just create a web page that listed the
> original names in the original directory structure, and then linked
> to the current book, wherever it is. It wouldn't require as much
> space as a full copy of all the books, and would probably be easier
> to keep in sync with any updated files.

Except when domains expire, sites go down, directory structures get moved around, and dozens of other situations where this is not the best approach, when you're relying on external sites to maintain their own content.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From pupeno at pupeno.com Thu Jan 6 08:12:21 2005
From: pupeno at pupeno.com (Pupeno)
Date: Thu Jan 6 08:29:34 2005
Subject: [gutvol-d] DocBook
In-Reply-To: <20050106133444.295F3EDF11@ws6-1.us4.outblaze.com>
References: <20050106133444.295F3EDF11@ws6-1.us4.outblaze.com>
Message-ID: <200501061312.23799.pupeno@pupeno.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday, January 6 2005 10:34, Joshua Hutchinson wrote:
> There is an ongoing effort to move to a master XML format, but instead of
> DocBook, we've tentatively chosen TEI (or rather, more specifically, a
> subset of TEI).
>
> If you'd like more information about it, please see this website:
> http://www.gutenberg.org/tei/

Why did you make your own DTD?
Instead of using the standard TEI one?
- --
Pupeno: pupeno@pupeno.com - http://www.pupeno.com
Reading Science Fiction ? http://sfreaders.com.ar
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFB3WNnfW48a9PWGkURAu3DAJ0VOFoD1k3GuNQDUDsxDLXsN08TZQCfeWeH
xf4V5XKwq6SWKDMSbeA4fLA=
=UX41
-----END PGP SIGNATURE-----

From joshua at hutchinson.net Thu Jan 6 08:34:16 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Jan 6 08:34:24 2005
Subject: [gutvol-d] DocBook
Message-ID: <20050106163416.503B1EDF26@ws6-1.us4.outblaze.com>

Well, one reason was to make things more manageable... The full TEI spec is 1400 pages long! I certainly don't want to support everything in there.

TEI-Lite, which is much more manageable, also leaves out a couple of little things that we decided we needed.

We also do a couple of things a little differently than base TEI. (The markup for a forced line break comes to mind here.)

For a more technical reason, I leave it to Marcello to answer.

Josh

----- Original Message -----
From: Pupeno
To: "Project Gutenberg Volunteer Discussion"
Subject: Re: [gutvol-d] DocBook
Date: Thu, 6 Jan 2005 13:12:21 -0300

>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Thursday, January 6 2005 10:34, Joshua Hutchinson wrote:
> > There is an ongoing effort to move to a master XML format, but instead of
> > DocBook, we've tentatively chosen TEI (or rather, more specifically, a
> > subset of TEI).
> >
> > If you'd like more information about it, please see this website:
> > http://www.gutenberg.org/tei/
>
> Why did you make your own DTD? Instead of using the standard TEI one?
>
> - --
> Pupeno: pupeno@pupeno.com - http://www.pupeno.com
> Reading Science Fiction ?
http://sfreaders.com.ar
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.6 (GNU/Linux)
>
> iD8DBQFB3WNnfW48a9PWGkURAu3DAJ0VOFoD1k3GuNQDUDsxDLXsN08TZQCfeWeH
> xf4V5XKwq6SWKDMSbeA4fLA=
> =UX41
> -----END PGP SIGNATURE-----

From holden.mcgroin at dsl.pipex.com Thu Jan 6 08:57:49 2005
From: holden.mcgroin at dsl.pipex.com (Holden McGroin)
Date: Thu Jan 6 08:57:57 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <15.3ba34d31.2f0ebf54@aol.com>
References: <15.3ba34d31.2f0ebf54@aol.com>
Message-ID: <41DD6E0D.1060809@dsl.pipex.com>

Gutenberg9443@aol.com wrote:
> Do I understand correctly that the fellow who works in the steel mill
> should be paid only what is necessary for the steel to be produced? And
> the waitress in a restaurant should be paid only enough for the orders
> to be taken and the food put on the table. Well, let's see, how much is
> the wear and tear on her shoe leather and her apron and the clothing she
> wears at the restaurant?
>
> For cryin' out loud, YOU CANNOT HAVE LABOR OF ANY KIND, WHITE COLLAR OR
> BLUE COLLAR, WITHOUT PAYING THE LABORER. Karl Marx and Jesus Christ
> agree on this, though they disagree as to how it should be done.

Firstly, I'd like to take issue with your analogies here. Creative work is not just like a regular job. Steelworkers and waitresses, with very few exceptions, would not make steel or wait on people if they weren't paid to do it. They do what they do because somebody is willing to pay them a price for which they are willing to give up their labour.

I'd not argue that some people in creative industries do work for similar reasons as your steelworker and waitress above. However, I would argue that the vast majority of people create not because they make a living that way but because they enjoy the act of creation.
Every day, I meet people who write books (even ones that I'd consider worth publishing) but never send off a copy to any publishers or agents. Every day, I meet people who write, perform and record music not because they're secretly hoping for a contract from the record industry but because they enjoy making music and they enjoy that they can give people a little entertainment.

> Life plus 100, or life plus 70, or life plus 60, is absurd. Life plus 25
> is not absurd. That's all I'm asking for. But too many people think I
> should get royalties for 10 years or 15 years and then no more, even
> when everybody else is still making money from the book.

This plumber you gave your hard-earned money to, how many years will you be sending royalty cheques to him?

Even if authors, musicians, artists and programmers were given no money to create, there would still be a large amount of creation going on. The original idea behind copyright (take a look in the U.S. constitution) is not to ensure that creators get paid for the rest of their natural lives (or, for that matter, until long after they're dead). Its purpose is solely to increase the level of creation that goes on.

However, that goal has been subverted. Copyright is no longer designed to reward artists for creation. It's designed solely to give lengthy monopolies to corporations.

Cheers,
Holden

From ag737 at freenet.carleton.ca Thu Jan 6 08:57:54 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Thu Jan 6 08:58:00 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <10d96c10e0be.10e0be10d96c@ncf.ca>

> Note: Not trying to be a smart@ss.
>
> Why do we need the bulwark specifically in an English-speaking country?
>
> Josh

A) Because without one, the entire Commonwealth is going to fall to the life+70 empire.
B) Because the sooner we get started on a Canadian life+50 project, the more harm we can demonstrate to policy-makers and potential allies in the broader community, when the full-court press for life+70 is brought down.

> 2. Because Canada has NOT made its copyright laws more restrictive
> over and over and over.

The wimps who make our copyright policy are under EXTREME pressure to extend the term, and, contrary to what you write, MH, have made some incredibly stupid and restrictive policy decisions in respect of other CR matters, other than term, in the past little while.

> 4. Even more specifically to host Gone With The Wind, after the
> lawyers' onslaught about it recently.

And also, if we can hold off till January 1 2007, the A.A. Milne corpus.

> 5. Project Gutenberg of Canada also should encourage more French eBooks,
> not just English ones.

Absolument! There's already a surprisingly robust PD movement in Quebec, and many of the projects I've created on PGDP have been in French, with more to come. The source scans from other sites (canadiana.org, BNQ, etc.) also provide ample French-language content to work from.

> 6. The more Project Gutenbergs the better. . . .

So when do we -- whoever WE is -- get PG-Canada, under life+50 rules, up and running?

From kth at srv.net Thu Jan 6 08:30:09 2005
From: kth at srv.net (Kevin Handy)
Date: Thu Jan 6 09:08:18 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To:
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net>
Message-ID: <41DD6791.7020200@srv.net>

David A. Desrosiers wrote:

>>Wouldn't it be easier to just create a web page that listed the
>>original names in the original directory structure, and then linked
>>to the current book, wherever it is. It wouldn't require as much
>>space as a full copy of all the books, and would probably be easier
>>to keep in sync with any updated files.
>>
>
> Except when domains expire, sites go down, directory
>structures get moved around, and dozens of other situations where this
>is not the best approach, when you're relying on external sites to
>maintain their own content.
>

I would think it would be a part of the Gutenberg site, not on a separate site. That way, all links would be relative. Any broken links would be because of a change in the master directory structure, or an update to one of the books, which would need to be handled anyway (I'm assuming you still want the latest versions of the books).

If you could add the original ebook number/directory to the metadata stored at Gutenberg, then you could periodically re-generate the web page(s) automatically with a simple Perl script. Or maybe just use a CGI script to create it on the fly, so it is automatic.

If you want it as a separate site, write the links to point at whichever mirror you want.

If, however, you want a static copy of the site up to when it switched to the new format, ignoring all new and updated books, then a separate site would probably be preferable. Or just grab a copy of the 10K special DVD, as it has the original directory structure, and mount it in a web-accessible way.

From hacker at gnu-designs.com Thu Jan 6 09:18:35 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jan 6 09:18:52 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DD6791.7020200@srv.net>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net>
Message-ID:

> I would think it would be a part of the Gutenberg site, not on a
> separate site. That way, all links would be relative. Any broken
> links would be because of a change in the master directory
> structure, or an update to one of the books, which would need to be
> handled anyway (I'm assuming you still want the latest versions of
> the books).
That doesn't help the problem at all, because what do you do with any images that may be used in the work, such as a DocBook copy of a book or an HTML version of a book? Do you symlink those across the tree also? This is a management nightmare, especially if things move around in the tree (as they are now).

It also doesn't remove the space constraints of having the full copy of the work in multiple formats. With the sheer size of the Gutenberg tree, this will rapidly become a full-time job to make sure everything is working right without breakage with thousands of symlinks all over the tree.

> If you want it as a separate site, write the links to point at
> whichever mirror you want.

Links don't point to remote servers, they point to local resources. Unless of course, these are replicated across some local filesystem and rsync'd from there.

> If, however, you want a static copy of the site up to when it
> switched to the new format, ignoring all new and updated books; then
> a separate site would probably be preferable. Or just grab a copy of
> the 10K special DVD, as it has the original directory structure, and
> mount it in a web-accessible way.

That DVD is pretty old at this point, and doesn't include the new directory structure, if I remember correctly.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From kth at srv.net Thu Jan 6 09:04:35 2005
From: kth at srv.net (Kevin Handy)
Date: Thu Jan 6 09:42:43 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To:
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net>
Message-ID: <41DD6FA3.9030802@srv.net>

David A. Desrosiers wrote:

>>switched to the new format, ignoring all new and updated books; then
>>a separate site would probably be preferable. Or just grab a copy of
>>the 10K special DVD, as it has the original directory structure, and
>>mount it in a web-accessible way.
>>
> If, however, you want a static copy of the site up to when it
>
> That DVD is pretty old at this point, and doesn't include the
>new directory structure, if I remember correctly.
>

Isn't that what this discussion is all about? Going back to the old directory structure? I was just trying to clarify exactly to what level they wanted to go back. I must be missing something here.

The DVD was created just before the new structure was started. The new structure went in around ebook 10000, and the DVD was created around ebook 9500 (so it is missing just under 500 books from before the new structure went in), so it is almost exactly what I understood was wanted.

From jonathan_ingram at yahoo.com Thu Jan 6 09:49:36 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Thu Jan 6 09:49:42 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <41DD6E0D.1060809@dsl.pipex.com>
Message-ID: <20050106174936.69198.qmail@web41728.mail.yahoo.com>

--- Holden McGroin wrote:
> However, that goal has been subverted. Copyright is no longer designed to
> reward artists for creation. It's designed solely to give lengthy monopolies
> to corporations.

Indeed. Life+X copyright terms, in general, *are* absurd. Under the life+70 regime in my country, if I publish a book when I'm 20, and live until 90, that book will be copy restricted for 140 years! This isn't a wild fantasy scenario, either -- recently on DP we proofread a book published in 1880, written by an author who died in 1933 (and which therefore only became copy restriction free at the beginning of last year).

--
Jon Ingram

From jonathan_ingram at yahoo.com Thu Jan 6 09:55:52 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Thu Jan 6 09:55:58 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <10d96c10e0be.10e0be10d96c@ncf.ca>
Message-ID: <20050106175552.3154.qmail@web41721.mail.yahoo.com>

--- "Wallace J.McLean" wrote:
> So when do we -- whoever WE is -- get PG-Canada, under life+50 rules,
> up and running?

Through DPEU I've already scanned and processed over 30 works which are public domain in life+70 (and hence life+50) countries, but not in the USA. Currently I have nowhere in the PG universe to publish them except PG-Australia, and I don't regard that as a viable long-term option -- at least, until I'm told otherwise. Even if PG Australia is happy to take my books, at the moment it's the only outlet for life+X texts, and we've seen with the Gone With The Wind fiasco the problems that can cause. We need a centralised PGUS-style storage system for life+X books that can be easily mirrored in other life+X countries, and that should preferably be based in the country with the most relaxed laws -- Canada.

--
Jon Ingram

From hart at pglaf.org Thu Jan 6 10:51:27 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 6 10:51:28 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To:
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net>
Message-ID:

On Thu, 6 Jan 2005, David A. Desrosiers wrote:

>
>> I would think it would be as a part of the Gutenberg site, not on a
>> seperate site. That way, all links would be relative. Any broken
>> links would be because of a change in the master directory
>> structure, or an update to one of the books, which would need to be
>> handled anyway (I'm assuming you still want the latest versions of
>> the books).
> > That doesn't help the problem at all, because what do you do > with any images that may be used in the work, such as a DocBook copy > of a book or an HTML version of a book? Do you symlink those across > the tree also? This is a management nightmare, especially if things > move around in the tree (as they are now). > > It also doesn't remove the space constraints of having the > full copy of the work in multiple formats. With the sheer size of the > Gutenberg tree, this will rapidly become a full-time job to make sure > everything is working right without breakage with thousands of > symlinks all over the tree. We are going to have that many links one of these days, anyway. >> If you want it as a seperate site, write the links to point at >> whichever mirror you want. > > Links don't point to remote servers, they point to local > resources. Unless of course, these are replicated across some local > filesystem and rsync'd from there. Actually, anyone is free to mount these on any servers they like, as long as they are given away free of all charges. >> If, however, you want a static copy of the site up to when it >> switched to the new format, ignoring all new and updated books; then >> a seperate site would probably be preferable. Or just grab a copy of >> the 10K special DVD, as it has the original directory structure, and >> mount it in a web-accessable way. > > That DVD is pretty old at this point, and doesn't include the > new directory structure, if I remember correctly. I think that was the point. . . . mh From gbnewby at pglaf.org Thu Jan 6 11:38:36 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Thu Jan 6 11:38:37 2005 Subject: [gutvol-d] PG-50/70? 
In-Reply-To: <20050106175552.3154.qmail@web41721.mail.yahoo.com> References: <10d96c10e0be.10e0be10d96c@ncf.ca> <20050106175552.3154.qmail@web41721.mail.yahoo.com> Message-ID: <20050106193836.GA18249@pglaf.org> On Thu, Jan 06, 2005 at 09:55:52AM -0800, Jonathan Ingram wrote: > --- "Wallace J.McLean" wrote: > > So when do we -- whoever WE is -- get PG-Canada, under life+50 rules, > > up and running? > > Through DPEU I've already scanned and processed over 30 works which are public > domain in life+70 (and hence life+50) countries, but not in the USA. Currently > I have nowhere in the PG universe to publish them except PG-Australia, and I > don't regard that as a viable long-term option -- at least, until I'm told > otherwise. Even if PG Australia is happy to take my books, at the moment it's > the only outlet for life+X texts, and we've seen with the Gone With The Wind > fiasco the problems that can cause. We need a centralised PGUS-style storage > system for life+X books that can be easily mirrored in other life+X countries, > and that should preferably be based in the country with the most relaxed laws > -- Canada. > > -- > Jon Ingram Jon, I think the answer might be PG-EU. Have you talked with Zoran or Branco or any of the other folks involved? If they're not ready yet, I can talk with my buds at xs4all and maybe work on getting a server there. This is life+70, not life+50. There's a pg-eu mailing list I can put you in touch with, if you'd like. Canada is the only life+50 country left I can think of that's "in the works," and they don't have a server yet. If you have a set of documents, maybe it will help to move things along. 
The pgcanada list is on http://lists.pglaf.org -- Greg From jmdyck at ibiblio.org Thu Jan 6 11:44:04 2005 From: jmdyck at ibiblio.org (Michael Dyck) Date: Thu Jan 6 11:46:46 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure References: <005001c4f373$eaf1a4b0$9bc9c7ac@jared> Message-ID: <41DD9504.C19954A5@ibiblio.org> Jared Buck wrote: > > Prof. Hart and I have been talking about (by email) the possibility > of reviving the old directory structure (ie etext 03, etext94, etc) Revive it? It still seems to be alive. E.g., http://www.gutenberg.org/dirs/etext03/ http://www.gutenberg.org/dirs/etext94/ But I suppose these are unsatisfactory for your purposes because they're gradually dwindling, as their texts are (refurbished and) reposted under the new directory structure. Eventually, they'll be empty. > as perhaps a classic PG website or at least an alternative site > for users to browse etexts the way they did before, with the etexts > organized by year. So: 1) Would this site exclude the post-10k texts (which never had a home in the old structure)? 2) Would it maintain separate copies of the texts, or merely simulate the old structure, providing hyperlinks into the new structure (or the new catalog) for the actual texts? 3) If it maintains separate copies, would it track changes to the corresponding texts in the new structure? > Prof. Hart also tells me there has been some discussion on this > issue on the mailing list recently, but I only just joined the > gutvol list a few days ago. Perhaps someone wouldn't mind filling > me in on what I missed? Hm. If he's thinking of the gutvol-d mailing list, and within the last few months, then the only thing I can find that's somewhat relevant is the discussion re Folio files. See the archives for December. (Click the link in the boilerplate at the bottom of the message.) > Jared Buck > ---------------------- > Project Gutenberg editor http://www.gutenberg,net (You might want to change that comma to a dot.) 
-Michael

From jonathan_ingram at yahoo.com Thu Jan 6 12:36:59 2005
From: jonathan_ingram at yahoo.com (Jonathan Ingram)
Date: Thu Jan 6 12:37:07 2005
Subject: [gutvol-d] PG-50/70?
In-Reply-To: <20050106193836.GA18249@pglaf.org>
Message-ID: <20050106203659.42713.qmail@web41724.mail.yahoo.com>

--- Greg Newby wrote:
> Jon, I think the answer might be PG-EU. Have you talked with Zoran or
> Branco or any of the other folks involved? If they're not ready yet,
> I can talk with my buds at xs4all and maybe work on getting a server
> there. This is life+70, not life+50. There's a pg-eu mailing list I
> can put you in touch with, if you'd like.
>
> Canada is the only life+50 country left I can think of that's "in the
> works," and they don't have a server yet. If you have a set of
> documents, maybe it will help to move things along. The pgcanada list
> is on http://lists.pglaf.org

I'd love to know more about PG-EU, so please feel free to forward information about the mailing list to me. We get occasional progress reports on PGEU through posts on the DPEU forum, but I've not seen anything recently. I'm looking forward to it starting up soon...

Even when it does start, though, the last part of my message still remains to be looked at. We need to make sure that all these life+X projects share their material, in such a way that people can download the whole collection just as easily as they can download the PG-US material. It's going to be harder than PG-US to manage, because the life+X countries (ignoring Australia) have a living public domain rather than the freeze-dried public domain of the USA, but it's something we'll have to sort out eventually.

--
Jon Ingram

From Gutenberg9443 at aol.com Thu Jan 6 13:05:36 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Thu Jan 6 13:05:51 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <60.4c704eae.2f0f0220@aol.com> In a message dated 1/6/2005 9:58:30 AM Mountain Standard Time, holden.mcgroin@dsl.pipex.com writes: I'd not argue that some people in creative industries do work for similar reasons as your steelworker and waitress above. However, I would argue that the vast majority of people create not because they make a living that way but because they enjoy the act of creation. Every day, I meet people who write books (even ones that I'd consider worth publishing) but never send off a copy to any publishers or agents. Every day, I meet people who write, perform and record music not because they're secretly hoping for a contract from the record industry but because they enjoy making music and they enjoy that they can give people a little entertainment. Goodie for them. I WRITE FULLTIME. Making a living is important to me. I am no longer physically capable of doing any other work. And I am sick and tired of this patronizing attitude from somebody who knows absolutely nothing whatever of the kind of work I do or the fact that it is extremely HARD work. If I gave a complete answer to this I would set fire to the internet. So I will simply say that you are totally and completely out of your mind. It is quite obvious that you hate anybody who wants to make a living in the arts. As long as YOU get paid for whatever YOUR paid work is, you care far too little what happens to anybody else. Knowing somebody who writes books is not the same as writing books yourself. When I was a teacher, I got paid monthly. When I was selling telephone systems to businesses, I got paid biweekly. When I was a cop, I got paid biweekly. As a writer, I get paid whenever some editor decides to get up off her butt and send me a check. 
If my book sits on her desk for a year because she's too lazy to open it and look at the first three pages, which is all it takes to say "No, I don't want this," that's my problem, as is the fact that most editors won't look at simultaneous proposals at all.

Greg, I'm sorry. I know you like me to stay current with what's going on in the mailing lists. But I'm taking four blood-pressure meds already. I'm bailing out. I will do all the things that I am committed to doing, and you know what they are. But I'm not going to read this crap anymore.

Anne

From joshua at hutchinson.net Thu Jan 6 13:47:12 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Thu Jan 6 13:47:21 2005
Subject: [gutvol-d] PG-50/70?
Message-ID: <20050106214712.7D49B9E8DA@ws6-2.us4.outblaze.com>

Anne, you really need to calm down when someone has a different opinion than you. Your blood pressure will thank you for it! :)

That being said, I agree with you on needing copyright. If I wrote something for profit, I'd want to actually get a profit for it. And, like you, I think that Life+75 is ridiculous. I think Life+25 is about the max *I* would personally set it at. Though I don't really like Life+ scenarios. I prefer a set limit after publication (yes, like the US, but nothing approaching the ridiculous copyright length we have now).

The original copyright in the US was 14 years (as others have pointed out previously), but there was a provision to extend it another 14 years if you applied for the extension. So, books were protected for 28 years if they were in print.

I think my perfect length would be 50 years from publication. That's plenty of time for you (and most likely your heirs) to make all the money you are likely to off the work, while still being somewhat relevant once it hits public domain.
Longer than that and obscure works start becoming extremely hard to find and we at PG don't even GET the opportunity to preserve it for history's sake. Josh ----- Original Message ----- From: Gutenberg9443@aol.com > > In a message dated 1/6/2005 9:58:30 AM Mountain Standard Time, > holden.mcgroin@dsl.pipex.com writes: > > I'd not argue that some people in creative industries do work for similar > reasons as your > steelworker and waitress above. However, I would argue that the vast > majority of people > create not because they make a living that way but because they enjoy the > act of creation. > Every day, I meet people who write books (even ones that I'd consider worth > publishing) > but never send off a copy to any publishers or agents. Every day, I meet > people who write, > perform and record music not because they're secretly hoping for a contract > from the > record industry but because they enjoy making music and they enjoy that they > can give > people a little entertainment. > > > > > Goodie for them. I WRITE FULLTIME. Making a living is important to me. I am > no longer physically capable of doing any other work. And I am sick and tired > of this patronizing attitude from somebody who knows absolutely nothing > whatever of the kind of work I do or the fact that it is extremely HARD work. > > If I gave a complete answer to this I would set fire to the internet. So I > will simply say that you are totally and completely out of your mind. It is > quite obvious that you hate anybody who wants to make a living in the arts. As > long as YOU get paid for whatever YOUR paid work is, you care far too little > what happens to anybody else. Knowing somebody who writes books is not the > same as writing books yourself. > > When I was a teacher, I got paid monthly. When I was selling telephone > systems to businesses, I got paid biweekly. When I was a cop, I got paid > biweekly. 
> As a writer, I get paid whenever some editor decides to get up off her butt > and send me a check. If my book sits on her desk for a year because she's too > lazy to open it and look at the first three pages, which is all it takes to > say "No, I don't want this," that's my problem, as is the fact that most > editors won't look at simultaneous proposals at all. > > Greg, I'm sorry. I know you like me to stay current with what's going on in > the mailing lists. But I'm taking four blood-pressure meds already. I'm > bailing out. I will do all the things that I am committed to doing, and you > know > what they are. But I'm not going to read this crap anymore. > > Anne > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From marcello at perathoner.de Thu Jan 6 13:00:40 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Jan 6 14:01:11 2005 Subject: [gutvol-d] DocBook In-Reply-To: <200501061312.23799.pupeno@pupeno.com> References: <20050106133444.295F3EDF11@ws6-1.us4.outblaze.com> <200501061312.23799.pupeno@pupeno.com> Message-ID: <41DDA6F8.9080205@perathoner.de> Pupeno wrote: > Why did you make your own DTD ? Instead of using the standard TEI one ? Because this is the way TEI is supposed to be used. Nobody uses the full TEI DTD as basis for a markup project. You decide which subset of TEI you need for your project and generate a DTD (yourself or with a tool called the Pizza Chef). There is even a standard way to extend TEI if the full TEI doesn't contain the tags you want. I used this standard way to add some presentational tags, to help the automatic generation of file formats. Still my conversion tools will handle almost any standard TEI-lite file. 
See: www.gutenberg.org/tei/ -- Marcello Perathoner webmaster@gutenberg.org From blondeel at clipper.ens.fr Thu Jan 6 14:05:40 2005 From: blondeel at clipper.ens.fr (Sebastien Blondeel) Date: Thu Jan 6 14:05:51 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <60.4c704eae.2f0f0220@aol.com> References: <60.4c704eae.2f0f0220@aol.com> Message-ID: <20050106220540.GA1223@clipper.ens.fr> On Thu, Jan 06, 2005 at 04:05:36PM -0500, Gutenberg9443@aol.com wrote: > In a message dated 1/6/2005 9:58:30 AM Mountain Standard Time, > holden.mcgroin@dsl.pipex.com writes: > > On Thu, 6 Jan 2005 at 11:20:36 EST, Gutenberg9443@aol.com wrote: > > > For cryin' out loud, YOU CANNOT HAVE LABOR OF ANY KIND, WHITE COLLAR > > > OR BLUE COLLAR, WITHOUT PAYING THE LABORER. Karl Marx and Jesus Christ > > > agree on this, though they disagree as to how it should be done. > > > > I'd not argue that some people in creative industries do work for > > similar reasons as your steelworker and waitress above. However, I > > would argue that the vast majority of people create not because they > > make a living that way but because they enjoy the act of creation. > > Every day, I meet people who write books (even ones that I'd consider > > worth publishing) but never send off a copy to any publishers or > > agents. Every day, I meet people who write, perform and record music not > > because they're secretly hoping for a contract from the record industry > > but because they enjoy making music and they enjoy that they can give > > people a little entertainment. > > Goodie for them. I WRITE FULLTIME. Making a living is important to me. I > am no longer physically capable of doing any other work. And I am sick > and tired of this patronizing attitude from somebody who knows > absolutely nothing whatever of the kind of work I do or the fact that > it is extremely HARD work. You two seem to get passionate, argue irrationally, and mix different kinds of problems here. 
This bogus (to me) argument often appears when people complain about the development of the sharing of material on P2P systems and the like: "artists (Artists!) need to be able to make a living, etc.".

Either you are some kind of modern slave, and you do a job you don't like because you need the money: a contract promises in advance some money to you in exchange for your work. This can be blue-collar work or some command-blue-collar work, even creative works.

Or you like what you are doing, and you would do it anyway (or if not, you are willing to take the risk of being successful or not).

I agree most waitresses and steelworkers wouldn't do their job if they didn't need the money. Many people are unlucky enough to have to do a job they hate (or don't like) for this reason.

I agree some (most?) creators would create (nearly as much or as well) no matter what. But sometimes creators and artists who advocate copyright laws to ensure their living give, in such discussions, the feeling that they want society at large to pay them for their creation no matter how good or bad it is. A simple criterion (and maybe the only one found until now, no matter how imperfect) is public success: either your work sells, and you are entitled to money, or you fail to please many, and why demand money for your work? Get a public or cultural institution to sign a contract with you before you start working on this obscure field, or forget the idea of being paid for it!

Anne, you write fulltime, with no contract guarantees. In the present situation, you make ends meet and make a living with that. If copyright laws get less harsh, as some people would like them to, you are afraid the balance will shift and you will not be able to make a living writing any more.

Fine. So what? Laws are not supposed to be designed to give anybody a way of living; they should serve the public good and make life better for most.
Many people would like to do nothing but their pet hobby and be paid for it, and can't do it. Maybe society would be better off with shorter copyright terms, even though that would prevent some creators (like you) from living like they do now.

If tomorrow copyright laws change and the new situation makes it impossible for you to keep making a living doing what you do now, you will know in advance. Whenever you choose to work on a new book, you will know what protection and money you can expect from laws and society (market, editors' power, etc.). If that money is the only reason for you to create, then you won't create any more. Nobody will have lied to you or stolen from you.

This is the idea of non-retroactivity of law. I guess (most) people who advocate changes in copyright law don't ask to change the law retroactively. If tomorrow some brave government listens to them and passes a law taking copyright protection down to 14 years or so, there will still be a big black cultural hole in the XXth century.

Unless something really hard happens, like a revolution or a war (after all, copyright is one of the first industries in the world now, many wars have been fought on less than that). Even like that, your grand-children should be able to sue for breach of non-retroactivity (unless some non-democratic government stays in place long enough for all of them or for the case to be unwinnable). And if in the meantime you and your children will have been deprived of your "legitimate rights" because there was a war, a dictatorship or whatever, well, I can't see how to prevent brute force from prevailing. This happened to many.

When Michael Hart spoke in the offices of the French National Assembly last year, somebody asked him what would be *his* ideal copyright laws. He replied this:

http://quatramaran.ens.fr/~blondeel/conf/2004-02-13-Hart-AssembleeNationale.txt

-=-=-=
Question: In their place, what copyright laws would you have set up?
Answer: My proposal is to design a program to predict sales curves. Database(?): when any book has sold, copyright expires. When a book is out of print, copyright expires. And print on demand dosn't count, we are not playing games here: the book has to be on a frequent shelf. That would give them the lion's share of the profit. Sales fall off fast. In 5, 10, 15, 20... 25 years, 99% of books have sold all they are ever gonna sell. After 5 years, 50% of books are out of print. -=-=-= (note: go up to the directory to find the speech he gave in UNESCO) From shimmin at uiuc.edu Thu Jan 6 14:45:50 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Thu Jan 6 14:46:00 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <20050106220540.GA1223@clipper.ens.fr> References: <60.4c704eae.2f0f0220@aol.com> <20050106220540.GA1223@clipper.ens.fr> Message-ID: <41DDBF9E.3090509@uiuc.edu> > Question: In their place, what copyright laws would you have set up? > > Answer: My proposal is to design a program to predict sales curves. > > Database(?): when any book has sold, copyright expires. > When a book is out of print, copyright expires. > And print on demand dosn't count, we are not playing games here: the > book has to be on a frequent shelf. > > That would give them the lion's share of the profit. Sales fall off > fast. > > In 5, 10, 15, 20... 25 years, 99% of books have sold all they are ever > gonna sell. After 5 years, 50% of books are out of print. Use-it-or-lose-it copyright is an interesting idea, but one that was born too late, for reasons suggested above. With traditional distribution channels, a product being in print was a good surrogate for a product still having commercial value, because the costs of keeping something in print ensured that it was foolhardy to keep something in print unless it still had commercial value. And this applied to other media as well as to print. 
But with modern distribution methods, like print-on-demand, it costs little to keep something in print indefinitely. And this can only be a good thing for all concerned. Why should print-on-demand be excluded, other than that it wrecks a use-it-or-lose-it copyright scheme? For books with expected runs of only a few thousand copies, it is already cost-competitive as the primary means of distribution, and its range of applicability can only increase.

If someone finally gets around to producing a PDA that's as much joy to read as a printed book, and ebooks become a major part of the publishing industry, we may see titles that were only ever available via download. In the music business, there are already titles that have only ever been available via mp3.com or iTunes. I foresee more and more media being distributed via means such that the concept of being "in print" becomes increasingly obsolete.

A scheme to adapt to this changing world might work as follows: When a copyright is registered (and while you don't have to register to have one, you do have to register to have one worth enforcing) the rights holder assesses a value for the work. They are free to change this value according to market conditions. As real property is taxed based on its assessed value, intellectual property would also be taxed on its assessed value. Meanwhile, the assessed value can be used to set prices in a compulsory licensing system. When a work no longer has commercial value, it behooves the rights holder to devalue it to the point where public use is essentially free, or be assessed tax on something that is no longer valuable to them.
-- RS From JBuck814366460 at aol.com Thu Jan 6 15:40:51 2005 From: JBuck814366460 at aol.com (Jared Buck) Date: Thu Jan 6 15:41:18 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> Message-ID: <002301c4f449$29bd0560$4c69c7ac@jared> No offense, Josh, this is a free country and you are welcome to provide your opinion as you see fit :) I'm trying to start a PG of Russia, I just got to get in touch with my girlfriend in Moscow, she'd be interested in helping once she's out of school later this month :) Prof. Hart and Greg Newby would like to see a Russian PG, and I have the time and resources to spend to work on one. Jared Buck ---------------------- Project Gutenberg editor http://www.gutenberg,net From JBuck814366460 at aol.com Thu Jan 6 15:44:03 2005 From: JBuck814366460 at aol.com (Jared Buck) Date: Thu Jan 6 15:44:18 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> Message-ID: <004401c4f449$9bc51170$4c69c7ac@jared> That sounds more like what i had in mind :) I want it to be simple and easy to navigate, and yes, we'd save server space by simply linking to the file(s) wherever they currently are. Jared Buck ---------------------- Project Gutenberg editor http://www.gutenberg,net ----- Original Message ----- From: "Kevin Handy" To: "Project Gutenberg Volunteer Discussion" Sent: Thursday, January 06, 2005 7:44 AM Subject: Re: [gutvol-d] Project Gutenberg Original Directory Structure > Joshua Hutchinson wrote: > >>----- Original Message ----- >>From: "Michael Hart" >> >>>On Thu, 6 Jan 2005, Joshua Hutchinson wrote: >>> >>> >>>>No disrespect, but ... >>>> >>>>That seems like a collosal waste of time. >>>> >>>It's not your time, or much of our time, it is the time of voluneers who >>>have asked to do this project, and there have been multiple requests. >>> >>> >> >>I agree. 
However, Jared was basically asking for comment on the idea,
>>which I provided. Personally, I see it as effort that would be better
>>spent elsewhere. I am the last person to dictate how a volunteer should
>>spend his/her time, though (since I'm sure plenty of people see TEI as a
>>waste of time as well, which is my pet project).
>>
>>I apologize, Jared, if I sounded like I was trying to stop you from
>>working on this. I was simply stating my opinion of the project.
>>
> Wouldn't it be easier to just create a web page that listed the original names
> in the original directory structure, and then linked to the current book,
> wherever it is. It wouldn't require as much space as a full copy of all the
> books, and would probably be easier to keep in sync with any updated files.
>

From sly at victoria.tc.ca Thu Jan 6 15:45:38 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Thu Jan 6 15:45:53 2005
Subject: [gutvol-d] DocBook
In-Reply-To: <200501061312.23799.pupeno@pupeno.com>
References: <20050106133444.295F3EDF11@ws6-1.us4.outblaze.com> <200501061312.23799.pupeno@pupeno.com>
Message-ID:

On Thu, 6 Jan 2005, Pupeno wrote:

>
> Why did you make your own DTD ? Instead of using the standard TEI one ?
>

The whole design of TEI is based on the idea that it can be taken and adapted for local use in a vast range of different projects. It is not intended to be ready to use "out-of-the-box".
Andrew

From JBuck814366460 at aol.com Thu Jan 6 15:51:46 2005
From: JBuck814366460 at aol.com (Jared Buck)
Date: Thu Jan 6 15:52:04 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
References: <005001c4f373$eaf1a4b0$9bc9c7ac@jared> <41DD9504.C19954A5@ibiblio.org>
Message-ID: <006e01c4f44a$afcb73c0$4c69c7ac@jared>

> So:
> 1) Would this site exclude the post-10k texts (which never had a home
> in the old structure)?
> 2) Would it maintain separate copies of the texts, or merely simulate
> the old structure, providing hyperlinks into the new structure (or
> the new catalog) for the actual texts?
> 3) If it maintains separate copies, would it track changes to the
> corresponding texts in the new structure?

1. Yes, it would likely exclude the post-10k texts but would link to the newer versions posted once we reached the 10-k format.

2. I'm not sure which way we're gonna go, separate text copies sounds the best way to me, with links to the newer versions so users always know where they can find the latest version of a particular etext.

3. Didn't I just explain this?

And thanks for pointing out the signature snafu, I didn't realize I had a comma in there, LOL.

Jared Buck
----------------------
Project Gutenberg editor
http://www.gutenberg.net

From sly at victoria.tc.ca Thu Jan 6 16:06:35 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Thu Jan 6 16:06:52 2005
Subject: [gutvol-d] Re: German texts and the m-dash
In-Reply-To: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com>
References: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com>
Message-ID:

On Thu, 6 Jan 2005, Joshua Hutchinson wrote:

> Inka (I'm assuming that's who did the HTML version) must not have used any of GutCutter-style tools BilFlis over at DP created! ;)
>
> -- to &mdash; is not a requirement, as far as I know, but it is something that I see most people do. I see no problem with replacing -- with &mdash; in the HTML, though. It would be a very simple find/replace to do so, too.
>

However, I do see a problem. Any "simple" global search/replace such as that has its risks. You cannot assume that every instance of "--" is an emdash.

For instance, what would happen to the following (from Roughing it in the Bush, PG#4389):

"You were fortunate, C---, to escape," said a backwood settler,

Andrew

From stephen.thomas at adelaide.edu.au Thu Jan 6 16:09:22 2005
From: stephen.thomas at adelaide.edu.au (Steve Thomas)
Date: Thu Jan 6 16:09:36 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DD6791.7020200@srv.net>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net>
Message-ID: <41DDD332.3070408@adelaide.edu.au>

Why not simply use server redirects so that anyone linking to the original location will be redirected to the new location for each text?

E.g. (assuming Apache):

    Redirect permanent /etext90/mayfl* \
        http://www.gutenberg.org/etext/7

(Not sure if that's entirely correct, but you get the idea.)

This would be a boon to those who've set up links elsewhere to specific texts, only to have them trashed by the relocation. There are probably many tens of thousands of such links that are currently broken.

Steve

--
Stephen Thomas, Senior Systems Analyst, University of Adelaide Library
UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA
Phone: +61 8 830 35190 Fax: +61 8 830 34369
Email: stephen.thomas@adelaide.edu.au
URL: http://staff.library.adelaide.edu.au/~sthomas/
CRICOS Provider Number 00123M
-----------------------------------------------------------
This email message is intended only for the addressee(s) and contains information that may be confidential and/or copyright. If you are not the intended recipient please notify the sender by reply email and immediately delete this email. Use, disclosure or reproduction of this email by anyone other than the intended recipient(s) is strictly prohibited.
No representation is made that this email or any attachments are free of viruses. Virus scanning is recommended and is the responsibility of the recipient.

From hacker at gnu-designs.com Thu Jan 6 16:20:30 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Thu Jan 6 16:21:15 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DDD332.3070408@adelaide.edu.au>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au>
Message-ID:

> Why not simply use server redirects so that anyone linking to the
> original location will be redirected to the new location for each
> text?

mod_rewrite is the more-scalable approach.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From ag737 at freenet.carleton.ca Thu Jan 6 16:42:31 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Thu Jan 6 16:42:48 2005
Subject: [gutvol-d] Most relaxed?
Message-ID: <11f5911215da.1215da11f591@ncf.ca>

I wouldn't say MOST relaxed, there are a few African countries that take that prize, but we're good for several reasons:

- physical proximity to the US provides logistical and $ savings for volunteers;
- part of the Anglo-American law world; we can, for now, stand as a bulwark against the extensionists, and the presence of such a library will, if large enough, allow for the mobilization of a defence of the public domain;
- English- and French-speaking; can make liaisons and bridges with the longer-life-plus-term UK and Ireland as well as the weird-term US (there are thousands of PD books under the pre-1923 rule in the US that are NOT PD in life+ countries, including Canada);
- large immigrant communities, and "heritage language" communities, among whose numbers we can recruit to help give the collection more internationality;
- excellent net.infrastructure, esp.
as compared to those countries with even less copyright; - several PD and digital library initiatives already in progress; - some potential corporate contributors; - already a strong base of PG/DP support, with one of the largest non- US contingents already involved. We can build a life+/CanCon version, and hopefully, by growing incrementally as PGDP I has, do so without destabilizing the Mothership. DRAWBACK: - Collective copyright administration; the literary, artistic and musical collective agencies will watch us like hawks. Scratch that; VULTURES. Which is why we have to have a failsafe clearance mechanism. I've got as good a knowledge of Canadian CR law, esp. as pertains to the undefined, in Canadian law, expression "public domain", as any non- lawyer does, and I've volunteered to help with the clearances. So, I repeat the call: Where are the tech people who can make this happen, and sooner rather than later? I'd start processing titles from Project Gutenberg Mothership according to their Life+50 eligibility as soon as the site is in place. We need a friendly host. ----- Original Message ----- >From Jonathan Ingram Date Thu, 6 Jan 2005 09:55:52 -0800 (PST) To Project Gutenberg Volunteer Discussion Subject Re: [gutvol-d] PG-50/70? We need a centralised PGUS-style storage system for life+X books that can be easily mirrored in other life+X countries, and that should preferably be based in the country with the most relaxed laws -- Canada. -- Jon Ingram From flis at detk.com Thu Jan 6 16:57:55 2005 From: flis at detk.com (William Flis) Date: Thu Jan 6 16:51:33 2005 Subject: [gutvol-d] Re: German texts and the m-dash In-Reply-To: Message-ID: > For instance, what would happen to the following (from > Roughing it in the Bush, PG#4389): > > "You were fortunate, C---, to escape," said a backwood settler, I never use three hyphens. In fact, I search for them and change them to either two or four. 
I'd have set this example as four hyphens, then in the HTML (automatically) converted each pair into an "&mdash;". Two em-dashes look like one continuous (double-em) dash in the browsers I use (IE and Firefox). Bill Flis From stephen.thomas at adelaide.edu.au Thu Jan 6 17:54:21 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Thu Jan 6 17:54:37 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure In-Reply-To: References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au> Message-ID: <41DDEBCD.6050103@adelaide.edu.au> David A. Desrosiers wrote: >>Why not simply use server redirects so that anyone linking to the >>original location will be redirected to the new location for each >>text? > > > mod_rewrite is the more-scalable approach. Sure. Whatever. It's really up to the server admin -- Marcello? Depends on what resources he has available on that server. Also, neither approach helps with mirror sites, because these things won't get mirrored. Maybe a simple (if tedious) use of symbolic links would be better, because that should flow on to mirror sites. Steve -- Stephen Thomas, Senior Systems Analyst, University of Adelaide Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA Phone: +61 8 830 35190 Fax: +61 8 830 34369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ CRICOS Provider Number 00123M
From blondeel at clipper.ens.fr Thu Jan 6 18:22:56 2005 From: blondeel at clipper.ens.fr (Sebastien Blondeel) Date: Thu Jan 6 18:23:11 2005 Subject: [gutvol-d] Re: German texts and the m-dash In-Reply-To: References: <20050106154330.0F8AC109912@ws6-4.us4.outblaze.com> Message-ID: <20050107022256.GB2396@clipper.ens.fr> Regarding the marking up issue, this is how I feel: PG TXT format is not meant to be read (it is ugly). It is meant to be "the" reference format, waiting for something spiffier (XML or the like). It is meant to be transformed into other formats, or viewed in nice reading tools (eg: PDA with proportional fonts, anti-aliasing, etc.). As such, typography has no place in it: it is the backend's problem, that is to say it falls in the bailiwick of the program that will transform this basic interchange format into something else. (LaTeX does it automatically with babel packages for instance; XHTML could maybe do that with the right stylesheet --- then you won't have to worry about inserting all paragraph indents for example). When I type e-mails, even in French, I don't take the trouble to include semi- or full-length non-breakable spaces in front of ;:!?» and the like, or after «. (By the way, I guess in German quotes work like this: He said: »Hello« and not, like in French: He said: «Hello». I guess you code those quotes just as is in your raw text formats). E-mails are plain text in fixed-width font, not a printed book with nice typography. As long as you don't destroy information, you can afterwards translate those things properly respecting classical typography. I try to do that for the PDF backend in http://www.eleves.ens.fr/home/blondeel/PGDP/ebooksgratuits/ For instance, in a French text: * any "--" appearing in the beginning of a paragraph is a dialog dash that should become "&ndash; " or maybe "&mdash; " in HTML.
* any other "--" is an em-dash that should become " &mdash; " in HTML (note the normal spaces: not unbreakable ones!) * maybe other rules that escape me now (number intervals?) On Thu, Jan 06, 2005 at 04:06:35PM -0800, Andrew Sly wrote: > However, I do see a problem. > Any "simple" global search/replace such as that has its risks. > You cannot assume that every instance of "--" is an emdash. People who perform such search and replaces are supposed to know what they are doing. If you want to distinguish between "--" appearing in the beginning of a paragraph or others, for instance, you will run a contextual search and replace. I understand some people don't know how to do that and don't want to learn how to do that. Then they will have to cope with the imperfect typography, and wait for PG to move to other formats: if/when some structured formats appear on PG, life will be much easier. For example you could go: User: Hey! show me book XXX in HTML format Server: there you are: [...] - Nice. Make the font bigger, the margins narrower, the titles bolder, etc. [*] Server (compiling this format on the fly): - there you are: [...] - Man! I like that book. Give it to me in PDF format. - there you are: [...] - Right. Give me both portrait format so I can print it, and landscape format with a bigger font so I can read it a little on the screen. - there you are: [...] [*] note: this you could do on your own, just changing the stylesheet of the XHTML file (see examples at the URL above). But the website/layout engine could do that for you. I can already do all of the above with the ebooksgratuits experiment I mentioned above (well, of course you would use drop-down menus and not natural language; I mean I could if I took the time to code it, but there is nothing difficult there: the proof of concept is out there.
The only slight problem is to teach LaTeX how to hyphenate words, but my program gives me the list of the words LaTeX couldn't hyphenate and their severity and context, and makes it possible for me to teach it how to hyphenate them). As for the case mentioned here, maybe it is a PP issue. Of course the HTML version should respect the typography more closely. > For instance, what would happen to the following (from > Roughing it in the Bush, PG#4389): > > "You were fortunate, C---, to escape," said a backwood settler, This would fail the contextual search and replace. To implement the transformations I detail above, you could do this (sed syntax, but of course you would use an easier programming language): s/^--\([^-]\)/\&ndash; \1/ s/\([^-]\)--\([^-]\)/\1 \&mdash; \2/g then you would check no "--" remain, you would check double spaces you may have introduced with the second transform (in case there were--wrongly--spaces around the "--" in the original text), etc. From shalesller at writeme.com Thu Jan 6 19:41:32 2005 From: shalesller at writeme.com (D. Starner) Date: Thu Jan 6 19:41:49 2005 Subject: [gutvol-d] PG-50/70? Message-ID: <20050107034132.EBE24164006@ws1-4.us4.outblaze.com> "Robert Shimmin" writes: > If someone finally gets around to producing a PDA that's as much joy to read as a printed book, > and ebooks become a major part of the publishing industry, we may see titles that were only ever > available via download. What do you mean, "we may see"? has a number of volumes available by download only. It's not that uncommon in the roleplaying industry. From nwolcott at dsdial.net Thu Jan 6 19:48:36 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Thu Jan 6 20:01:11 2005 Subject: [gutvol-d] PG-50/70?
References: <10d96c10e0be.10e0be10d96c@ncf.ca><20050106175552.3154.qmail@web41721.mail.yahoo.com> <20050106193836.GA18249@pglaf.org> Message-ID: <00b301c4f46d$7c7e96a0$d79495ce@gw98> Do we have any contacts with good relations with Iran? ----- Original Message ----- From: Greg Newby To: Project Gutenberg Volunteer Discussion Sent: Thursday, January 06, 2005 2:38 PM Subject: Re: [gutvol-d] PG-50/70? > On Thu, Jan 06, 2005 at 09:55:52AM -0800, Jonathan Ingram wrote: > > --- "Wallace J.McLean" wrote: > > > So when do we -- whoever WE is -- get PG-Canada, under life+50 rules, > > > up and running? > > > > Through DPEU I've already scanned and processed over 30 works which are public > > domain in life+70 (and hence life+50) countries, but not in the USA. Currently > > I have nowhere in the PG universe to publish them except PG-Australia, and I > > don't regard that as a viable long-term option -- at least, until I'm told > > otherwise. Even if PG Australia is happy to take my books, at the moment it's > > the only outlet for life+X texts, and we've seen with the Gone With The Wind > > fiasco the problems that can cause. We need a centralised PGUS-style storage > > system for life+X books that can be easily mirrored in other life+X countries, > > and that should preferably be based in the country with the most relaxed laws > > -- Canada. > > > > -- > > Jon Ingram > > Jon, I think the answer might be PG-EU. Have you talked with Zoran or > Branco or any of the other folks involved? If they're not ready yet, > I can talk with my buds at xs4all and maybe work on getting a server > there. This is life+70, not life+50. There's a pg-eu mailing list I > can put you in touch with, if you'd like. > > Canada is the only life+50 country left I can think of that's "in the > works," and they don't have a server yet. If you have a set of > documents, maybe it will help to move things along. 
The pgcanada list > is on http://lists.pglaf.org > > -- Greg > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From Gutenberg9443 at aol.com Thu Jan 6 20:56:21 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Thu Jan 6 20:56:41 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology Message-ID: <110.40a721d2.2f0f7075@aol.com> First-- As everybody seeks to get copyright laws revised, I have a suggestion that will warm the heart (and the cooking fire) of every writer on earth. I recommend that regardless of how long a period of time a copyright extends, the first year that a publisher does not pay the writer at least $2000 in that year, the book should be considered effectively out of print, and the copyright should automatically revert to the writer. During all those years that the publisher is sitting on a book it's not making an effort to distribute, it's often the case that the writer could do other things with that book, even--in some cases--turn it over to PGLAF. Someone asked me what my longest-surviving book has been. It is SCENE OF THE CRIME, from Writer's Digest Books, and it has been in print since 1992. The first printing sold out before the official release date, because of the Writer's Digest Book Club. After that, it remained viable, though the entire rest of the series it was part of died, as a result of three unexpected events: (1) A lot of police officers bought it on the grounds that it was more thorough and less boring than their official police science books; (2) The O. J. Simpson trial showed a lot of people what happens when a crime scene is worked by total idiots, and I was asked to comment on that fact on nationwide television; and (3) the CSI shows have been a success. My most recent royalty check, though, was pretty small, and I doubt it will make it through another year. 
It needs to be thoroughly revised and I'm not up to doing the work, and WD people don't want it revised anyway. They prefer to kill it. It has earned me a total of about $18,000. I spent an entire calendar year working on it. Despite the fact that crime scene work had been my job for years, I was determined to be totally correct and up-to-date with my research. At times I had as many as 75 library and ILL books in my office; as I also slept in my office at that time, things got pretty crowded. The last year I worked for the telephone company I earned $30,000. Admittedly I would far rather write a good book than sell telephone systems to businesses, but since when did it become improper for people to have a job that they like rather than a job that they don't like? I really loved my police work. I'd wake up in the morning and think, "Oh, darn, I can't go to work today." I was one of the best fingerprint examiners on earth. I could do stuff the FBI couldn't do. I was called in to help FBI agents, Secret Service agents, postal inspectors, Marine Corps CID, and various small-town police agencies. It gave me an incredible feeling of power when I had just made a nonsuspect ident--that is, identified a criminal by no clue at all except fingerprints made in a place only the criminal could have made them, by cold-searching the prints through all the fingerprint cards we had--but there's no way on God's green earth that I could do that work now. I also cannot possibly sell telephone systems to businesses, at which I was marginal at best, or teach students to learn how to write, at which I was fair to middling competent. So I'm back where I started when I was seven years old. I can write. I can edit. I can wash the dishes if I can stay out of bed long enough to do it. I apologize for recent outbursts on my part. I hope at least some of you can understand why they have occurred. Now I am going to crawl back into the woodwork and resume anonymity. 
Anne From sly at victoria.tc.ca Thu Jan 6 22:28:22 2005 From: sly at victoria.tc.ca (Andrew Sly) Date: Thu Jan 6 22:28:40 2005 Subject: [gutvol-d] Outdated information Message-ID: Hmmm.... It's interesting to note that as a result of still having the old promo.net webpage active, and turning up high on the list in search engines, I can find a reference to Project Gutenberg like the following: * Project Gutenberg -- Index One of the oldest and largest of electronic text archives, Project Gutenberg contained 6,267 eBooks as of November 2002 (the latest published figure as of December 2004). I think I'll look for a contact email, and send them a message... Andrew From sly at victoria.tc.ca Fri Jan 7 00:39:28 2005 From: sly at victoria.tc.ca (Andrew Sly) Date: Fri Jan 7 00:39:48 2005 Subject: [gutvol-d] Request for French Spell Checker Message-ID: I have a text "Le retour de l'exile" by Louis Frechette, that I have prepared for PG from another online source. It looks as if it is in good shape, but I have found an occasional scanning error in it. Would anyone have a French-language spell check they could run it through? I have tried to download one, but have had trouble installing it. An html version can be found here: http://www.victoria.tc.ca/~sly/RETOUR2.HTM.zip Alternatively, I could provide a plain text version. Thanks, Andrew From gbnewby at pglaf.org Fri Jan 7 01:26:53 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Jan 7 01:26:55 2005 Subject: [gutvol-d] PG-50/70?
In-Reply-To: <20050106203659.42713.qmail@web41724.mail.yahoo.com> References: <20050106193836.GA18249@pglaf.org> <20050106203659.42713.qmail@web41724.mail.yahoo.com> Message-ID: <20050107092653.GB4926@pglaf.org> On Thu, Jan 06, 2005 at 12:36:59PM -0800, Jonathan Ingram wrote: > > --- Greg Newby wrote: > > Jon, I think the answer might be PG-EU. Have you talked with Zoran or > > Branco or any of the other folks involved? If they're not ready yet, > > I can talk with my buds at xs4all and maybe work on getting a server > > there. This is life+70, not life+50. There's a pg-eu mailing list I > > can put you in touch with, if you'd like. > > > > Canada is the only life+50 country left I can think of that's "in the > > works," and they don't have a server yet. If you have a set of > > documents, maybe it will help to move things along. The pgcanada list > > is on http://lists.pglaf.org > > I'd love to know more about PG-EU, so please feel free to forward information > about the mailing list to me. We get occasional progress reports on PGEU > through posts on the DPEU forum, but I've not seen anything recently. I'm > looking forward to it starting up soon... Send PG-EU mailing list submissions to pg-eu@vrijschrift.org To subscribe or unsubscribe via the World Wide Web, visit http://mailman.vrijschrift.nl/listinfo/pg-eu or, via email, send a message with subject or body 'help' to pg-eu-request@vrijschrift.org You can reach the person managing the list at pg-eu-owner@vrijschrift.org -- Greg From traverso at dm.unipi.it Fri Jan 7 02:49:48 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Fri Jan 7 02:47:25 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology In-Reply-To: <110.40a721d2.2f0f7075@aol.com> (Gutenberg9443@aol.com) References: <110.40a721d2.2f0f7075@aol.com> Message-ID: <200501071049.j07Anmx27091@posso.dm.unipi.it> I think that we have to revise the aim and implementation of copyright. 
Copyright has useful features (allowing writers to make a living) and bad features (since its implementation gives a long-term monopoly to publishers, all the disadvantages of monopoly appear). A short-term monopoly might be reasonable, to allow the initial publisher to recover the investment, but a long one is bad. I think that the solution is decoupling monopoly and author royalties. There is nothing that disallows a double system: for a short period (14+14?) the author has a monopoly that he can transfer to a publisher, negotiating his conditions; after that, there is no monopoly, everybody can republish, but should give to the author a fair share of the sale price of the published work. This might be state-guaranteed, through author's registration, that might be handled by an international authority (I am thinking of UNESCO) to avoid overriding by national law differences. There is the possibility that an author will earn less, through reduced prices, but might get better compensation through increased sales and better diffusion. This royalty right might as well be life+N, maybe even life+100; but the royalty collection right should be non-transferable, it should go to natural heirs only or (as a limit case) to a literary foundation, not transferred to a publisher. I think that such a proposal might be favorably accepted, since it does not reduce author's rights, only monopoly power (words have to be carefully chosen...) I think that such a proposal, beyond technicalities, should be acceptable to everybody (except monopolists, of course), and could gather a consensus with good slogans (down with monopolies!) This is the scheme that was designed by Victor Hugo, and not accepted. He was considering limited control and perpetual royalties, to be given to natural heirs or, if they do not exist, to a state foundation to encourage beginner authors.
I have started translating his proposal from French into English: it would be useful to have a revision proposal of copyright based on the proposal of a "Noble Father". Carlo Traverso From blondeel at clipper.ens.fr Fri Jan 7 04:54:36 2005 From: blondeel at clipper.ens.fr (Sebastien Blondeel) Date: Fri Jan 7 04:55:01 2005 Subject: [gutvol-d] Request for French Spell Checker In-Reply-To: References: Message-ID: <20050107125436.GA28485@clipper.ens.fr> On Fri, Jan 07, 2005 at 12:39:28AM -0800, Andrew Sly wrote: > I have tried to download one, but have had trouble > installing it. I reply on the list to ask at the same time: is there any effort to centralize such tools, lists of words (for different centuries and languages), people competent in such or such language, etc.? I usually use Paul Zimmermann's epelle. Sometimes I have to tweak a program to clean up the source and avoid useless warnings, especially when the work is in several files (I include special words in a "local dictionary" file but as I didn't understand how to tell epelle to use it, I programmed a filter upstream) > An html version can be found here: > http://www.victoria.tc.ca/~sly/RETOUR2.HTM.zip The HTML version "shouldn't" have em-dashes as they are, especially in French (see my previous message where I back this opinion up a little). > Alternatively, I could provide a plain text version. w3m -dump did it for me. w3m -dump Retour2.htm | perl -pe 's/([^-])--([^-])/$1 $2/g' > Retour2.txt $ epelle -8bit Retour2.txt | wc -l 339 There are scannos all right. (I developed a scanno-finder program as well as other help tools; see http://www.eleves.ens.fr/home/blondeel/PGDP/AutoCorrect/ about that) The problem with epelle is sometimes it munches stuff: tata' could become tat? and you are lost to find the problem in the file... this is where I use other programs around it. The best is to do that interactively, with context, using ispell with a French dictionary for example.
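The contextual replacement rules described earlier in this thread (a paragraph-initial "--" as a dialogue dash, any other isolated "--" as an em-dash, and longer hyphen runs such as "C---" left alone) can be sketched in Python. This is only an illustration under assumptions: the function names are made up for the example, `&ndash;` and `&mdash;` are taken to be the intended HTML entities, and each input string is assumed to hold exactly one paragraph.

```python
import re

# Assumed rules, paraphrased from the thread (French text):
#   1. "--" opening a paragraph is a dialogue dash  -> "&ndash; "
#   2. any other isolated "--" is an em-dash         -> " &mdash; "
#   3. runs of three or more hyphens (e.g. "C---") are left untouched
LEADING = re.compile(r"^--(?!-) *")
INNER = re.compile(r"(?<!-) *--(?!-) *")


def convert_dashes(paragraph: str) -> str:
    """Apply rules 1 and 2 to a single paragraph of plain text."""
    text = LEADING.sub("&ndash; ", paragraph)
    # Surrounding spaces are absorbed by the pattern, so no double
    # spaces are introduced when the source already had " -- ".
    return INNER.sub(" &mdash; ", text)


def leftover_hyphen_runs(text: str) -> list[str]:
    """List hyphen runs that survived conversion, for manual review."""
    return re.findall(r"-{2,}", text)


if __name__ == "__main__":
    for sample in (
        "--Bonjour, dit-il.",
        "Il vint--trop tard.",
        '"You were fortunate, C---, to escape,"',
    ):
        converted = convert_dashes(sample)
        print(converted, leftover_hyphen_runs(converted))
```

Reporting the leftover runs rather than guessing at them keeps cases like "C---" in front of a human, which is the point of doing the replacement contextually.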
If mdashes are a problem add spaces around them, ispell the resulting file, and switch mdashes back like they were. (File sent in private). From laneb at cpsc.ucalgary.ca Thu Jan 6 09:05:21 2005 From: laneb at cpsc.ucalgary.ca (Brendan Lane) Date: Fri Jan 7 10:43:08 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <15.3ba34d31.2f0ebf54@aol.com> References: <15.3ba34d31.2f0ebf54@aol.com> Message-ID: On Thu, 6 Jan 2005 Gutenberg9443@aol.com wrote: > This is not a flame. It's a careful exchange of views with whoever I'm > quoting. That was Jeroen, and I'll let him answer questions about whether he agrees with Karl Marx and Jesus Christ himself. I have a few comments about this: > ...Why should the publisher be paid . . . and the bookseller be paid . . > . and the ink supplier be paid . . . and the paper supplier be paid . . > . and the shipping companies be paid . . . and so on and so forth, for > as long as people want to read a book, but the person who wrote the book > should fall out of the loop and stop getting paid? Making the perhaps foolish assumption that this isn't just a rhetorical question, I shall attempt an answer: _Because the person who wrote the book isn't doing any new work._ The publisher is still printing it, the bookseller is still selling it, the ink and paper suppliers are supplying ink and paper, the shipping companies are still shipping it around...but the writer has nothing to do with this. It's possible that the writer hasn't done anything but sit on the couch and watch TV for twenty years. Perhaps this is belabouring the obvious, but this is the fundamental problem the copyright system is meant to address -- that it's not necessary that the author get paid more than once, since he's already done all of his work before the book is ever printed. 
Of course, we want authors to keep writing new books, so we agree to give them this legal "copy-right" which necessitates bringing them back into the loop -- now they _are_ performing some economically useful activity (even if it's just agreeing to let the book be published) and so can claim some reward, i.e. royalties, which exceeds what they could have gotten if they'd been paid up front, rewards them if the book is a hit, and so on. If you're concerned that someone is making money off a book while the writer is "cut out of the loop", well, that's an argument for _perpetual_ copyright, not limited copyright. People are printing Dickens' works, making movies about them...hell, the number of showings of A Christmas Carol in the month of December alone is enormous. Lots of people are making a lot of money off Charles Dickens, yet he (well, his estate, but since you argue for Life+, I assume you equate the estate with the author in some way) doesn't see one red cent -- he's been "cut out of the loop". The only way to make sure that as long as anyone makes money off your work, that you (or your estate) does too, is eternal copyright. > Life plus 100, or life plus 70, or life plus 60, is absurd. I agree. There are those who don't. Some think that eternal copyright is the only fair regime. (If not forever, then perhaps forever less a day.) > Life plus 25 is not absurd. Maybe. I happen to think it is still far too long. What Jeroen, Robert, Wallace, and perhaps others who I can't remember right now, are saying is that maybe there's a rational way to determine what's absurd or not, what's fair or not. As you say above, this is not a flame, just a careful (and frank) exchange of viewpoints. Brendan Lane From hart at pglaf.org Fri Jan 7 11:10:15 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 7 11:10:16 2005 Subject: [gutvol-d] no flame. 
suggestion, comments, apology In-Reply-To: <200501071049.j07Anmx27091@posso.dm.unipi.it> References: <110.40a721d2.2f0f7075@aol.com> <200501071049.j07Anmx27091@posso.dm.unipi.it> Message-ID: On Fri, 7 Jan 2005, Carlo Traverso wrote: > > I think that we have to revise the aim and implementation of copyright. > > Copyright has useful features (allowing writer to make a living) and > bad features (its implementation being giving a long-term monopoly to > publishers, all the disadvantages of monopoly appear). Jason Pontin, editor-in-chief of the MIT Technology Review says: "Copyright is the essence of intellectual creation. . . ." Of course, he neglects to take into account how many of the great works were created before copyright or by those who were against copyright, such as Milton. Not only were the greatest writers before copyright, such as Shakespeare and Dante, it is quite likely that they couldn't have published what they did under current copyright laws. From jeroen.mailinglist at bohol.ph Fri Jan 7 13:08:29 2005 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Fri Jan 7 13:07:30 2005 Subject: [gutvol-d] PG-50/70? In-Reply-To: <15.3ba34d31.2f0ebf54@aol.com> References: <15.3ba34d31.2f0ebf54@aol.com> Message-ID: <41DEFA4D.7000808@bohol.ph> Gutenberg9443@aol.com wrote: >Do I understand correctly that the fellow who works in the steel mill should >be paid only what is necessary for the steel to be produced? And the >waitress in a restaurant should be paid only enough for the orders to be taken and >the food put on the table. Well, let's see, how much is the wear and tear on >her shoe leather and her apron and the clothing she wars at the restaurant? > > You got the point in the first sentence, which is correct. 
The basic idea of a free market is that producers compete for customers, and as a result, prices drop to what is necessary to produce them, including the salary of the people doing the work, but without undue welfare provisions or other payments that are not necessary to get the work done. -- and to make things even more clear, I am not even promoting a totally free market, because such a market would not have copyrights at all, and actually am for considerably more market regulation in many fields, as I do like to see everybody in a reasonable standard of living. I have lived long enough in India, and half my relatives live in the Philippines in a house smaller than the average American garage to know the realities of "free market" with a surplus of labour. One reason I am working on PG is because I promised myself to provide every school in my wife's province with a decent library and access to sufficient study materials. >For cryin' out loud, YOU CANNOT HAVE LABOR OF ANY KIND, WHITE COLLAR OR BLUE >COLLAR, WITHOUT PAYING THE LABORER. Karl Marx and Jesus Christ agree on >this, though they disagree as to how it should be done. > > > Why do you think that was not clear to me? However, who tells me I or anybody should pay for labour we didn't ask for? That is where communism failed, as nobody wanted to work for its ideals anymore, however nice and appealing they are. Sharing, after all, is much nicer than being greedy, isn't it? >It is 5:40 PM on a snowy day and we just called the plumber. Why should we >have to pay him $35 just for coming here, before he even looks at the problem? >Does it cost that much in gasoline? Surely it isn't the cost of the truck, >because it's old enough that it's already paid for. > > Because you asked him to come, and agreed on that fare beforehand. You have an option of not calling him. Whatever he needs to pay with the money is none of your business.
If he can't pay for his cost from his earnings, he will be forced to increase his prices or go broke, if he earns shiploads of money, he will face competition until the profit margins go to reasonable levels, which historically are in the order of 10 to 20 percent of your revenue. >I have been working for SIX YEARS on one book. At the moment it's about >200,000 words long. I have thrown away closer to two MILLION words that I wound >up tossing and rewriting. Let's see, what is the cost of the paper . . . and >the ink . . . and the computer . . . and the printer . . . Does that sum up the >cost of writing the book? And I shouldn't get any more than that? > > Having spent such hard work doesn't give you any right to a monetary reward, not even copyright works that way. Your work will have to add value for customers willing to give you money for it. If they don't want your work, too bad, and you lose, if they do, you earn, and only then copyright is your friend - since it DOES give you an opportunity to earn back your investment, which would otherwise be taken away by people who didn't invest those six years, and could copy your work right after you sold the first copy. >WHY? I would make this 100-point type except that it would wind up perfectly >ordinary type on other people's machines. Why should the publisher be paid . >. . and the bookseller be paid . . . and the ink supplier be paid . . . and >the paper supplier be paid . . . and the shipping companies be paid . . . and >so on and so forth, for as long as people want to read a book, but the >person who wrote the book should fall out of the loop and stop getting paid? > > Because, and I can say it in any point size you wish to display the message in, they too offer a service that has value to people willing to pay for it. You seem to be emotional, but I never proposed to abolish copyright, invented to fix the problem of free-riders, only to reduce it to durations that make economic sense.
>Life plus 100, or life plus 70, or life plus 60, is absurd. Life plus 25 is
>not absurd. That's all I'm asking for. But too many people think I should get
>royalties for 10 years or 15 years and then no more, even when everybody else
>is still making money from the book.

I consider anything based on the life of the author absurd, as it discriminates on age (why should older authors get a shorter return on their investment, even though the difference in present value between authors who are 25 and 75 is very small?), and makes it extremely difficult to establish the copyright status of a work. Having a simple fixed term is both non-discriminatory and much more convenient for society.

I can also ask the question the other way round: why should people have the right to stop me from copying your works? It is not your property, and when I copy it, I use my labour and my materials, do it in the privacy of my house with my effort only, and take nothing from you at all. And still you want to take that freedom away from me? What for?

If you look at it another way, should the maker of a taxi always get paid when the taxi driver earns money using that car? No, of course not: the car factory sets its price, sells the car, and then it is up to the new owner of the car what he wants to do with it. Or should we introduce royalties for car factories, to be paid every time somebody rides in a car they made...?

Copyright is not a natural law, and should not be confused with physical property. It is an attempt by the legislator to fix a problem of free markets. It is a restraint society offers to book writers due to one particular feature of books: they are very easy to copy. But I think society should not make this offer longer than necessary, since you have the choice not to write the book at all if you don't like the offer, and if you can't earn back your investment in your book, you should either increase your selling price or turn to another trade.
The public interest, however, is to have books published, and thus to enable writers to write books, and thus to make an offer that is sufficient to allow them to do so. If it takes you six years to write a book, you probably require about six years' worth of a decent income, so 28 years is actually a very generous offer, and it is bad for society to pay more, in two ways: first, the money can't be spent on other authors to write other books; and second, why would you ever write another book if the first one already gives you enough income? If your book is valuable, I want you to write more of them...

Copyright is not, and was never supposed to be, a welfare scheme, but it has been subverted into something like one by greedy lobbyists. Copyright is there to promote science and the arts, and should be nothing more than that.

To all authors seeking eternal copyright welfare, I can only say, "Get a real Job!"

Jeroen Hellingman

From joshua at hutchinson.net Fri Jan 7 15:36:58 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Fri Jan 7 15:36:51 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <002301c4f449$29bd0560$4c69c7ac@jared>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <002301c4f449$29bd0560$4c69c7ac@jared>
Message-ID: <41DF1D1A.4050706@hutchinson.net>

Now a PG of Russia idea I like! You have my full support on that one! :)

Jared Buck wrote:

> No offense, Josh, this is a free country and you are welcome to
> provide your opinion as you see fit :) I'm trying to start a PG of
> Russia, I just got to get in touch with my girlfriend in Moscow, she'd
> be interested in helping once she's out of school later this month :)
> Prof. Hart and Greg Newby would like to see a Russian PG, and I have
> the time and resources to spend to work on one.
>
> Jared Buck
> ----------------------
> Project Gutenberg editor http://www.gutenberg,net
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

From marcello at perathoner.de Fri Jan 7 10:18:47 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Jan 7 15:42:58 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DDD332.3070408@adelaide.edu.au>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au>
Message-ID: <41DED287.6050806@perathoner.de>

Steve Thomas wrote:

> Why not simply use server redirects so that anyone linking to the
> original location will be redirected to the new location for each text?

Because maintaining that will be a nightmare and a server performance hog.

We already have an error.php which tries to figure out what the user wanted to get and redirects the user accordingly. Adding a lookup into the database would be easy. Regrettably I already deleted most of the old files from the database. I have to see if I can get them back in.

--
Marcello Perathoner
webmaster@gutenberg.org

From JBuck814366460 at aol.com Fri Jan 7 15:55:27 2005
From: JBuck814366460 at aol.com (Jared Buck)
Date: Fri Jan 7 15:55:43 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com><002301c4f449$29bd0560$4c69c7ac@jared> <41DF1D1A.4050706@hutchinson.net>
Message-ID: <000701c4f514$5e001030$e441c4ac@jared>

Heh, that's good to know you have my back :) I speak a little Russian, but with my girlfriend working together with me on a PG-Russia, such a site would be both in English and in Russian (especially for people studying Russian who are required to read classic Russian works in their original language).
Implementing a PG-Russia shouldn't be that hard, and both Greg and Michael have lent their support to me for doing this, including the possibility of lending me some server space to host the site, which would likely be located at an address like http://www.gutenberg.ru or something similar.

Jared Buck
----------------------
Project Gutenberg editor http://www.gutenberg.net

----- Original Message -----
From: "Joshua Hutchinson"
To: "Project Gutenberg Volunteer Discussion"
Sent: Friday, January 07, 2005 3:36 PM
Subject: Re: [gutvol-d] Project Gutenberg Original Directory Structure

> Now a PG of Russia idea I like! You have my full support on that one! :)
>
> Jared Buck wrote:
>
>> No offense, Josh, this is a free country and you are welcome to provide
>> your opinion as you see fit :) I'm trying to start a PG of Russia, I
>> just got to get in touch with my girlfriend in Moscow, she'd be
>> interested in helping once she's out of school later this month :) Prof.
>> Hart and Greg Newby would like to see a Russian PG, and I have the time
>> and resources to spend to work on one.
>>
>> Jared Buck
>> ----------------------
>> Project Gutenberg editor http://www.gutenberg,net

From marcello at perathoner.de Fri Jan 7 16:02:03 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Fri Jan 7 16:02:08 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DDEBCD.6050103@adelaide.edu.au>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au> <41DDEBCD.6050103@adelaide.edu.au>
Message-ID: <41DF22FB.8070806@perathoner.de>

Steve Thomas wrote:

> Sure. Whatever. It's really up to the server admin -- Marcello?
> Depends on what resources he has available on that server.

Try this one:

http://www.gutenberg.net/etext03/napol10.txt

--
Marcello Perathoner
webmaster@gutenberg.org

From traverso at dm.unipi.it Sat Jan 8 10:01:03 2005
From: traverso at dm.unipi.it (Carlo Traverso)
Date: Sat Jan 8 09:58:12 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <000701c4f514$5e001030$e441c4ac@jared> (JBuck814366460@aol.com)
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com><002301c4f449$29bd0560$4c69c7ac@jared> <41DF1D1A.4050706@hutchinson.net> <000701c4f514$5e001030$e441c4ac@jared>
Message-ID: <200501081801.j08I13324875@posso.dm.unipi.it>

A kind of PG-Russia already exists: http://lib.ru/ (not that I can really read it, but I can see the list of authors, and access the books).

Carlo Traverso

From blondeel at clipper.ens.fr Sat Jan 8 13:47:53 2005
From: blondeel at clipper.ens.fr (Sebastien Blondeel)
Date: Sat Jan 8 13:48:05 2005
Subject: [gutvol-d] no flame. suggestion, comments, apology
In-Reply-To: <110.40a721d2.2f0f7075@aol.com>
References: <110.40a721d2.2f0f7075@aol.com>
Message-ID: <20050108214753.GA7234@clipper.ens.fr>

Good! Numbers to crunch. We made the quantitative jump. Very nice.

You say your most successful book so far, probably http://www.amazon.com/exec/obidos/tg/detail/-/0898795184/102-5247226-9968935 took you more than 2000 hours of work and brought in about 18,000 USD.

That is 6 to 9 dollars an hour of work... Hardly more than the minimum wage in the US, about the minimum wage in France. And one is not even sure to get that when working on a book. Most books bring their author much less money...

Let's make a little detour through the music and record industry, which resembles the book industry in some ways (and is more in the headlines these days).

A few months ago there was this French singer on a TV show.

| Host: We haven't seen you in the media for a long time. What have you
| been doing?
|
| Singer: nothing!
|
| Host: How do you make your living then?
|
| Singer: the day I opened my mouth and sang "Capri", I made myself a
| living for the rest of my life. 40 years after, it still brings me
| a good 3000 to 4000 EUR a month.

And he said this rather arrogantly. But such people are an exception[*].

[*] Most singers who are members of the SACEM (a French company managing royalties for music) get nothing but peanuts. Interestingly, if you sign up with SACEM at any time in your life, all your subsequent songs MUST be managed by them; you can no longer produce Creative Commons or public domain stuff, publish your records yourself, or go and work with other guys.

The French music majors have been orchestrating a media public relations campaign for a few months on the subject of file sharing on the Internet.
See one of their ads at: http://zmaster007.free.fr/pubsnep.htm

They repeat "this is stealing, artists must make a living, saying a CD costs very little to make is like saying the movie _Gone with the Wind_[*] just cost the price of its film stock...".

[*] Funny he used that example...

This is very effective in time-limited, unreflective, spectacle-driven speech, as in the media, but it forgets a number of things:

1/ most of the money does not go to the artists (sometimes majors also say "we spend it on marketing, etc."... as if they really helped discover "new talents"!)

2/ very few people live off their music

How many French people make their living from the income they get from the mere sales of their records? Their books? (And derived products.) Out of 60 million people? I would say just a few hundred, maybe thousands. A few more if you count people living off concerts, but then this looks more like a "real job".

Most books are written by people who have a "normal" job on the side. Most people in music bands have a job on the side. So less copyright "protection" would not mean less creation. And this "support the artist" idea is bogus:

1/ most artists don't benefit from it

2/ don't deprive millions for the sake of a few tens

3/ creation would go on anyway.

Much of the time, the biggest successes are made by unknown people. Can't you think of movies for which the original was genuine and the sequels nothing but an attempt to squeeze more money out of the public because the first opus proved to be popular?

Anne, that book is 240 pages long. It took you, say, 10 hours a page to make. Suppose you are a very inefficient (or, more positively, an especially caring and researching) author, and it takes most authors less time for every page they produce (how little? 5 hours? 2 hours?). Still, we could derive from sales numbers[*] (is that data available somewhere?)
the expected income of all books for each hour spent on them and see whether or not it is possible for their authors to make a living with their typewriter.

[*] You don't tell us how many such books you sold. At a "list price" of 16.99 USD, if you were in France and on a typical publishing contract, you would make about 1.30 USD for each book sold. It would therefore take about 14,000 books sold to give you that kind of revenue.

Now, of course, the question is: is there any point imagining new copyright schemes and new laws here? Even if we could reach an agreement, the road would still be long to any chance of publicizing it.

From jeroen.mailinglist at bohol.ph Sat Jan 8 15:20:58 2005
From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account))
Date: Sat Jan 8 15:19:44 2005
Subject: [gutvol-d] no flame. suggestion, comments, apology
In-Reply-To: <20050108214753.GA7234@clipper.ens.fr>
References: <110.40a721d2.2f0f7075@aol.com> <20050108214753.GA7234@clipper.ens.fr>
Message-ID: <41E06ADA.3060805@bohol.ph>

Sebastien Blondeel wrote:

>[*] Most singers members of the SACEM (a French company managing
>royalties for music) get nothing but peanuts. Interestingly, if you sign
>up with SACEM any time in your life, all your subsequent songs MUST be
>managed by them, you can no longer produce Creative Commons, public
>domain stuff, editor your records yourself or go and work with other
>guys.

The Dutch BUMA/STEMRA has similar all-or-nothing clauses in their contracts, but at least you can break with them (but then you have to withdraw all your works from them); and since they are an effective monopoly, you have nowhere else to go. I am thinking about filing complaints about this with anti-monopoly authorities, but I need musicians or others directly affected by this who are willing to join in, and I need to find people in such an organization who understand that it is NOT the copyright monopoly itself that I am complaining about.

Jeroen.
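
A quick check of the figures in Sebastien's message above, as a minimal Python sketch. The inputs (18,000 USD, 2,000 hours, 16.99 USD list price, 1.30 USD per copy) are the ones quoted in the thread; the 3,000-hour figure is only an assumption chosen here to reproduce the low end of the "6 to 9 dollars" range, since the message says no more than "more than 2000 hours". The function names are illustrative.

```python
import math


def hourly_rate(total_income_usd: float, hours_worked: float) -> float:
    """Income earned per hour of work."""
    return total_income_usd / hours_worked


def copies_needed(target_income_usd: float, royalty_per_copy_usd: float) -> int:
    """Smallest whole number of copies whose royalties reach the target."""
    return math.ceil(target_income_usd / royalty_per_copy_usd)


income = 18_000.0        # total income quoted for the book
royalty_per_copy = 1.30  # estimated author's share per copy sold
list_price = 16.99       # list price quoted in the thread

print(hourly_rate(income, 2_000))               # 9.0
print(hourly_rate(income, 3_000))               # 6.0 (3,000 hours is assumed)
print(copies_needed(income, royalty_per_copy))  # 13847, i.e. "about 14,000"
print(round(100 * royalty_per_copy / list_price, 1))  # 7.7 (percent of list price)
```

On these numbers the per-copy royalty is a little under 8% of the list price, which is why roughly 14,000 copies are needed to reach 18,000 USD.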
From hacker at gnu-designs.com Sat Jan 8 15:25:35 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 15:26:20 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID:

This site[1] appears to have converted a large portion of the Gutenberg etexts to Microsoft Reader (.lit) format, and is selling a CD of these for $14.95/cd. They clearly state[2] that they're reusing Gutenberg content directly, reformatted and converted to this other proprietary format.

Closer inspection of the actual .lit files themselves still shows several references to Project Gutenberg in various places, but they aren't complying with the Gutenberg License[3], at least sections 1 and 2 that I can see:

1. Commercial use: The "small print" license includes a royalty schedule for commercial use of the Project Gutenberg trademark, including any sort of resale.

2. Modification: Only unmodified copies of these eBooks may be redistributed without limitation. If you make changes to eBooks (other than alteration for different display devices), you are prohibited from including the header or otherwise associating your derivative work with Project Gutenberg. Note especially that if you choose to make changes, then you must remove the header, and you may NOT use the Project Gutenberg trademark.

Is this a violation? I only ask because _another_ company is giving away the .lit books from the first site, on their own site[4], free of charge. I'd hate to have one bad company suck another one in through association.

[1] http://freeeliterature.com/index.htm
[2] http://freeeliterature.com/CD1%20Contents.htm
[3] http://www.gutenberg.org/license
[4] http://www.diesel-ebooks.com/cgi-bin/category.cgi?category=free_download

David A.
Desrosiers desrod@gnu-designs.com http://gnu-designs.com

From stephen.thomas at adelaide.edu.au Sat Jan 8 17:59:38 2005
From: stephen.thomas at adelaide.edu.au (Steve Thomas)
Date: Sat Jan 8 17:59:56 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41DF22FB.8070806@perathoner.de>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au> <41DDEBCD.6050103@adelaide.edu.au> <41DF22FB.8070806@perathoner.de>
Message-ID: <41E0900A.8090100@adelaide.edu.au>

That's great, exactly what I wanted -- and I see that napol10.zip also takes you to the catalog page. So I assume you are redirecting napol10.*

Would it be possible to also redirect napol* -- so that you'd get the catalog page regardless of version? That would mean that anyone with an old link would then discover through the catalog that there was a later version.

I guess your main problem is that some works have been relocated to the new structure, while others have not, which is going to make things more complex.

How hard would it be to just do a mass-migration of all works to the new structure, and then put in place a redirection for all old-style links? (I mean a migration only in the sense of moving the files. The updating work could be done later.)

(That would actually be pretty nasty for mirror sites, because they'd suddenly need to update all the old works at once. But that has to happen some time.)

Steve

Marcello Perathoner wrote:

> Steve Thomas wrote:
>
>> Sure. Whatever. It's really up to the server admin -- Marcello?
>> Depends on what resources he has available on that server.
>
> Try this one:
>
> http://www.gutenberg.net/etext03/napol10.txt
>

--
Stephen Thomas, Senior Systems Analyst, Adelaide University Library
ADELAIDE UNIVERSITY SA 5005 AUSTRALIA
Tel: +61 8 8303 5190 Fax: +61 8 8303 4369
Email: stephen.thomas@adelaide.edu.au
URL: http://staff.library.adelaide.edu.au/~sthomas/

From gbnewby at pglaf.org Sat Jan 8 18:48:30 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sat Jan 8 18:48:32 2005
Subject: [gutvol-d] Project Gutenberg Original Directory Structure
In-Reply-To: <41E0900A.8090100@adelaide.edu.au>
References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au> <41DDEBCD.6050103@adelaide.edu.au> <41DF22FB.8070806@perathoner.de> <41E0900A.8090100@adelaide.edu.au>
Message-ID: <20050109024830.GA20045@pglaf.org>

On Sun, Jan 09, 2005 at 12:29:38PM +1030, Steve Thomas wrote:
> ...
> How hard would it be to just do a mass-migration of all works to the new
> structure, and then put in place a redirection for all old-style links?
> (I mean a migration only in the sense of moving the files. The updating
> work could be done later.)

The procedure is that files that get moved get updated:

- new header
- run through gutcheck and other checking

It's a manual process. The advantage is that after things are moved, we have a certain level of confidence in their formatting & quality.

-- Greg

From j.hagerson at comcast.net Sat Jan 8 19:04:44 2005
From: j.hagerson at comcast.net (John Hagerson)
Date: Sat Jan 8 19:04:58 2005
Subject: FW: [gutvol-d] Project Gutenberg Original Directory Structure
Message-ID: <000801c4f5f7$f8b9cd30$6401a8c0@sarek>

I believe that there would be an issue redirecting some links of the type napol* because we have the same book name in two different ebook* directories for two different eBooks.

Steve Thomas wrote:

That's great, exactly what I wanted -- and I see that napol10.zip also takes you to the catalog page.
So I assume you are redirecting napol10.*

Would it be possible to also redirect napol* -- so that you'd get the catalog page regardless of version? That would mean that anyone with an old link would then discover thru the catalog that there was a later version.

From hacker at gnu-designs.com Sat Jan 8 19:29:22 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 19:30:23 2005
Subject: [gutvol-d] Dead, down, broken, missing, empty Gutenberg mirrors
Message-ID:

I just took a few minutes to check the mirror list (mostly because the ibiblio rsync mirror is now so slow, a 9600 modem would be faster at sending bits across). Here's what I found:

  d == domain down/dead
  a == access required
  r == connection refused
  e == empty collection
  w == wrong path (for ftp mirrors)
  4 == 404 (for http mirrors)
  l == limit reached

  [d] ftp://ftpbook.dhs.org/mirrors/gutenberg/
  [a] ftp://elib.phil.pku.edu.cn/pub/gutenberg/
  [r] ftp://dlib.eramisp.com/gut/
  [e] http://gutenberg.kk.dk/
  [w] ftp://www.artfiles.org/gutenberg.org/
  [e] ftp://ftp.iif.hu/pub/gutenberg/
  [4] http://gutenberg.unipmn.it/mirror/
  [l] ftp://ftp.mirrorservice.org/sites/metalab.unc.edu/pub/docs/books/gutenberg/
  [w] ftp://ftp.samurai.com/pub/gutenberg/
  [d] http://www.atexiansattic.com/www2/gutenberg/
  [e] http://www.spiritdancers.org/pg/

Maybe this will be useful to others who maintain that SQL dump list.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From hacker at gnu-designs.com Sat Jan 8 19:48:04 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 19:48:21 2005
Subject: [gutvol-d] Dead, down, broken, missing, empty Gutenberg mirrors
In-Reply-To:
References:
Message-ID:

> I just took a few minutes to check the mirror list (mostly
> because the ibiblio rsync mirror is now so slow, a 9600 modem would
> be faster at sending bits across).

Incidentally, the Gutenberg rsync page[1] could use a minor optimization/update to reflect shorter, more-concise rsync options.
Currently, it states:

  rsync -rlHtSv --delete

Those can be shortened to:

  # or HavS/SHav if it's easier to remember
  rsync -avHS --delete

In my own case, I also use -z and --partial, and exclude many of the files that aren't necessary to pull across for my mirror (the DVD and enormous genome datafiles, for example).

Hope this helps.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From JBuck814366460 at aol.com Sat Jan 8 20:08:51 2005
From: JBuck814366460 at aol.com (JBuck814366460@aol.com)
Date: Sat Jan 8 20:09:11 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID: <7a.6a5c69e0.2f120853@aol.com>

You bet it is - these ebooks cannot be redistributed unless the FULL header is included somewhere in the etext, either at the beginning of it or at the end. At least, that is what I gather from having read through the header many times.

PG offers CDs AND DVDs of its etext collections for FREE without you needing to pay a cent. This company, IMHO, is just trying to rip people off by making them pay for something you can get free of charge from PG just by asking.

Jared

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050108/dd169ef8/attachment.html

From JBuck814366460 at aol.com Sat Jan 8 20:12:30 2005
From: JBuck814366460 at aol.com (JBuck814366460@aol.com)
Date: Sat Jan 8 20:12:52 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID: <6a.4c51b19a.2f12092e@aol.com>

I should also mention that you can convert etexts to .lit format yourself with software on the net or from Microsoft, at little or no cost to yourself.
Not all PG etexts are in .lit format, but if you want to submit a copy of the etext in .lit format to be uploaded to the PG servers, anyone is welcome to do that, etexts in just about ANY computer-readable format are welcome :)

Jared

From hacker at gnu-designs.com Sat Jan 8 20:15:02 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 20:15:22 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To: <7a.6a5c69e0.2f120853@aol.com>
References: <7a.6a5c69e0.2f120853@aol.com>
Message-ID:

> PG offers CDs AND DVDs of its etext collections for FREE without you
> needing to pay a cent. This company, IMHO, is just trying to rip
> people off by making them pay for something you can get free of
> charge from PG just by asking.

I've actually asked PG several times (via email) for copies of the DVD so I can hand them out at my LUG[1] (I'm the president). It's been a few months, and I haven't heard a single peep or reply to those emails yet.

That's partly why I'm mirroring it on my own here now.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From hacker at gnu-designs.com Sat Jan 8 20:17:00 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 20:17:23 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To: <6a.4c51b19a.2f12092e@aol.com>
References: <6a.4c51b19a.2f12092e@aol.com>
Message-ID:

> I should also mention that you can convert etexts to .lit format
> yourself with software on the net or from microsoft, and with pretty
> much either free or minimal cost to yourself.
I wish I could do this in an automated fashion, with a tool that runs from the shell, ideally in non-Windows environments (i.e. Linux and BSD).

The only way I've seen to do this requires loading each doc in Microsoft Word, clicking the Reader/Conversion button on the toolbar, picking a SaveAs filename, and saving it back to disk. A long, boring, tedious process, to be sure.

There is an SDK, and I could probably write something around Mono to work with it, but it wouldn't be pretty. I'd just as soon stay away from proprietary formats as much as I can.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From JBuck814366460 at aol.com Sat Jan 8 20:17:41 2005
From: JBuck814366460 at aol.com (JBuck814366460@aol.com)
Date: Sat Jan 8 20:18:02 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID:

I got a PG DVD myself here at home :) Sometimes it just takes a little while; Aaron Cannon is a little busy sometimes. If you haven't gotten the DVD yet, email him and let him know the situation, he'll get you a copy :)

You also might want to apprise Mike Hart of the situation; he can probably get ahold of Aaron and see to it that you get your DVD.

Jared

From hacker at gnu-designs.com Sat Jan 8 20:17:31 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 20:18:23 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To:
References: <7a.6a5c69e0.2f120853@aol.com>
Message-ID:

> I've actually asked PG several times (via email) for copies of the
> DVD so I can hand them out at my LUG[1] (I'm the president).
Oops, forgot the obligatory link to our LUG:

http://www.eclug.net/

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From JBuck814366460 at aol.com Sat Jan 8 20:19:55 2005
From: JBuck814366460 at aol.com (JBuck814366460@aol.com)
Date: Sat Jan 8 20:20:17 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID: <24.67fa9e97.2f120aeb@aol.com>

I usually prefer either .txt or .html myself, LOL. PDF is also good (the new version coming out soon will offer Linux support, so I have heard). I run Red Hat Linux 9 on my machine in a dual-boot environment (alongside Wincrap XP) and will shortly upgrade RH9 to Fedora Core 3 once I finish downloading and burning the ISOs to CD-R next week. I LOVE my new DSL connection, it's very handy ;)

Jared

From hacker at gnu-designs.com Sat Jan 8 20:20:23 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 20:21:23 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To:
References:
Message-ID:

> You also might want to appraise Mike Hart of the situation, he can
> probably get ahold of Aaron and see to it that you get your DVD.

I'll probably be getting a DVD burner soon.. might make the whole situation a LOT easier. Once I have the full mirror here to work with, I'm planning on making a LOT more contributions back to PG, once I wrap my head around all of the pieces and parts.
I definitely see some areas for a little improvement (non-conflicting, of course. I don't want to start that "My format is better!" turf war here again).

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From hacker at gnu-designs.com Sat Jan 8 20:22:37 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Sat Jan 8 20:23:23 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To: <24.67fa9e97.2f120aeb@aol.com>
References: <24.67fa9e97.2f120aeb@aol.com>
Message-ID:

> I usually prefer either .txt or .html myself, LOL. PDF is also good
> (the new version coming out soon will offer Linux support, so I have
> heard.

New version of what? PGDVD? Acroread? I've been using PDF on Linux for several years now, with acroread, xpdf, Oo.org, Perl, and other tools, without much difficulty.

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From JBuck814366460 at aol.com Sat Jan 8 20:26:04 2005
From: JBuck814366460 at aol.com (JBuck814366460@aol.com)
Date: Sat Jan 8 20:26:27 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
Message-ID:

Acrobat's ver. 7.0 should be out in a few months and will have new features (I'm not entirely up to speed on what's new, but I read it will have enhanced Linux support). A new ver. of Reader will come with that too.

Jared

From j.hagerson at comcast.net Sat Jan 8 20:36:24 2005
From: j.hagerson at comcast.net (John Hagerson)
Date: Sat Jan 8 20:36:44 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To:
Message-ID: <002d01c4f604$c9c41730$6401a8c0@sarek>

We now have a queue for requesting the PG DVD. Please use this web page http://www.gutenberg.org/cdproject/dvdreq-usa.html in the US and this one http://www.gutenberg.org/cdproject/dvdreq-int.html outside the US.

John Hagerson
(one of the) Project Gutenberg Media Fulfillment Volunteer(s)

-----Original Message-----
From: gutvol-d-bounces@lists.pglaf.org [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of JBuck814366460@aol.com
Sent: Saturday, January 08, 2005 10:18 PM
To: gutvol-d@lists.pglaf.org
Subject: Re: [gutvol-d] Is this a Project Gutenberg violation?

I got a PG DVD myself here at home :) sometimes it just takes a little while, Aaron Cannon is a little busy sometimes. If you haven't gotten the DVD yet, email him and let him know the situation, he'll get you a copy :)

You also might want to appraise Mike Hart of the situation, he can probably get ahold of Aaron and see to it that you get your DVD.

Jared

From Gutenberg9443 at aol.com Sat Jan 8 20:47:20 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Sat Jan 8 20:47:41 2005
Subject: [gutvol-d] in case you're interested . . . .OT but not totally irrelevant
Message-ID: <1c1.2210ddcf.2f121158@aol.com>

I'm starting an e-publishing company. The only thing it will use any time soon from PG is one book which I put on PG myself, stating that it was copyrighted but that I was giving PG a permanent non-exclusive loan of the copyright, provided that nobody used it for profit. That last clause didn't work till the water got hot--I found it a few weeks later on somebody's for-profit site, with my copyright notice still on it. I spent about six months on that book and have yet to receive a penny for it, which in that case is okay because I wasn't doing it for profit.
Well, I take that "not a penny" back sort of, because the guy who had it on his for-profit site agreed to give me a different book he had there in return for the use of my copyrighted book. He couldn't have asked the author of the other book if that was okay due to the fact that the author of the other book is dead, though not long enough for his book to be public domain unless the first copyright wasn't renewed. If I ever chance to use anything else from PG, I will reedit it and probably rewrite it somewhat; and it will be at least a hundred years old and therefore not subject to even the most insane copyright laws. I have taken that Amazon thing out of the Webpage about six times but it comes right back without permission. That ticks me off considerably because the only things I sell through Amazon are used treebooks (that I wrote, and they aren't really used at all but I have to sell them as such), and I don't have any even posted there at the moment. At least when I'm selling my own books through my own publishing company I can set the price myself, which is somewhat less than the company that originally published them set. If the author had any say-so on the price of the book, nobody would ever have expected anybody to pay $30 for something that came out of my head. They would have sold a lot better if the price had been a lot more reasonable, but authors have no say-so whatever on jacket design, price, release date, sneaky and usually artless rewriting behind the author's back, and so on and so forth. Also, authors very rarely get much over 10% of the selling price (not the marked price). 
Besides my own books, I have six-and-a-half good fantasy novels from a dead friend (signed on that with her literary executor last week, and have to finish writing the last book), two collections of literary short stories coming in when they're finished, a grammar reference book to be written by the best grammarian I have ever met, and so forth--I've asked my husband to write a book about how value is created but he doesn't have time right now and I don't want one by anyone who is not in the Ludwig von Mises camp of economy--last time I checked I'm committed to 50 books, about a third of which I have to complete, or write, or rewrite myself besides editing. So that'll run to at least two years, maybe three, combined with time used for other for-pay work I'm doing and need not discuss. And by the way, I DO discuss jacket design, price, and release date with my authors, and I don't rewrite without permission from the author or his/her estate unless the book is over a hundred years old. I will NEVER treat any of my writers the way my publishers treated me. If you wish to comment, send to the address given in the Website. It's at _Live Oak House_ (http://hometown.aol.co.uk/utahliveoak/myhomepage/business.html) , also known as _http://hometown.aol.co.uk/utahliveoak/myhomepage/business.html_ (http://hometown.aol.co.uk/utahliveoak/myhomepage/business.html) . I'd have a better Website if I could afford it, but I can't, and this is the best I can get. I have quit reading _gutvol-d@lists.pglaf.org_ (mailto:gutvol-d@lists.pglaf.org) except for Michael's newsletters. If you wish to tell me how terribly unreasonable I am for expecting to be paid for my work (exclusive of volunteer and pro bono work, which I prefer NOT to be paid for), please go where it doesn't snow and leave me the h*** alone. 
I have yet to meet a landlord who quit charging his tenants rent as soon as he had paid for the construction costs of his rental property, or who then cut the rent back to only enough to cover maintenance and taxes. Oil well owners get a "depletion allowance" in their taxes because eventually the wells will run out of oil. The possibility of authors running out of brain-power is not considered. My grammarian is in the early stages of Alzheimer's and she's going to be writing her book as fast as she can while she can still think. In the highly unlikely event that any of you have something you want published for profit and lack the contacts to do it yourself, submit it to me in the usual fashion, and tell me you're part of PGLAF and I'll have a look at it. I just don't want to be barraged by manuscripts that should have been used for kitty litter, which happened some years ago when we were trying to do an e-publishing company. Anne Husband, seeing flock of birds seated side by side on electric wire: "Look, even the birds have gone online." -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050108/35bae7db/attachment-0001.html From cannona at fireantproductions.com Sat Jan 8 23:24:05 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sat Jan 8 23:24:53 2005 Subject: [gutvol-d] Is this a Project Gutenberg violation? In-Reply-To: References: <7a.6a5c69e0.2f120853@aol.com> Message-ID: <6.1.2.0.0.20050109012234.01b87ec0@mail.fireantproductions.com> Sorry I never got your discs to you. I found your original message, but I'm not sure what happened and why you didn't get your discs. If you would like me to send you a large quantity of discs, please e-mail me off list and let me know how many of each and which address to send them to. Thanks. 
Sincerely
Aaron Cannon

At 10:15 PM 1/8/2005, you wrote:
>>PG offers CDs AND DVDs of its etext collections for FREE without you
>>needing to pay a cent. This company, IMHO, is just trying to rip people
>>off by making them pay for something you can get free of charge from PG
>>just by asking.
>
> I've actually asked PG several times (via email) for copies of
> the DVD so I can hand them out at my LUG[1] (I'm the president). It's
> been a few months, and I haven't heard a single peep or reply to those
> emails yet.
>
> That's partly why I'm mirroring it on my own here now.
>
>
>David A. Desrosiers
>desrod@gnu-designs.com
>http://gnu-designs.com

--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.)

From gbnewby at pglaf.org Sun Jan 9 00:29:21 2005
From: gbnewby at pglaf.org (Greg Newby)
Date: Sun Jan 9 00:29:22 2005
Subject: [gutvol-d] Dead, down, broken, missing, empty Gutenberg mirrors
In-Reply-To:
References:
Message-ID: <20050109082921.GA26494@pglaf.org>

On Sat, Jan 08, 2005 at 10:29:22PM -0500, David A. Desrosiers wrote:
>
> I just took a few minutes to check the mirror list (mostly
> because the ibiblio rsync mirror is now so slow, a 9600 modem would be
> faster at sending bits across).
>
> Here's what I found:

Thanks, David. I'll check these again in a few days, in case these errors are transient, then remove the broken ones. I think the .cn one is inaccessible except from within China, intentionally.
-- Greg

> d == domain down/dead
> a == access required
> r == connection refused
> e == empty collection
> w == wrong path (for ftp mirrors)
> 4 == 404 (for http mirrors)
> l == limit reached
>
> [d] ftp://ftpbook.dhs.org/mirrors/gutenberg/
> [a] ftp://elib.phil.pku.edu.cn/pub/gutenberg/
> [r] ftp://dlib.eramisp.com/gut/
> [e] http://gutenberg.kk.dk/
> [w] ftp://www.artfiles.org/gutenberg.org/
> [e] ftp://ftp.iif.hu/pub/gutenberg/
> [4] http://gutenberg.unipmn.it/mirror/
> [l] ftp://ftp.mirrorservice.org/sites/metalab.unc.edu/pub/docs/books/gutenberg/
> [w] ftp://ftp.samurai.com/pub/gutenberg/
> [d] http://www.atexiansattic.com/www2/gutenberg/
> [e] http://www.spiritdancers.org/pg/
>
> Maybe this will be useful to others who maintain that SQL dump
> list.
>
> David A. Desrosiers
> desrod@gnu-designs.com
> http://gnu-designs.com

From brett at dimetrodon.demon.co.uk Sun Jan 9 02:35:00 2005
From: brett at dimetrodon.demon.co.uk (Brett Paul Dunbar)
Date: Sun Jan 9 02:35:43 2005
Subject: [gutvol-d] no flame. suggestion, comments, apology
In-Reply-To: <20050108214753.GA7234@clipper.ens.fr>
References: <110.40a721d2.2f0f7075@aol.com> <20050108214753.GA7234@clipper.ens.fr>
Message-ID:

Sebastien Blondeel writes
>
>2/ very few people live with their music
>
>How many French people make their living with the income they get from
>the mere sales of their records? books? (and derived products) out of 60
>million people? I would say just a few hundreds, maybe thousands. A
>little more if you kick in people living with concerts but then this
>looks more like a "real job". Most books are written by people who have
>a "normal" job on the side. Most music bands people have a job on the
>side.

Whilst it may be true that most books are written by part-time writers, it is also true that hardly anyone reads most books.
I expect that most books read are written by full-time writers, or writers who could be full-time if they chose to be. e.g. books by Terry Pratchett make up about 1% of UK fiction sales, the impact on the supply of good fiction if he still had to have a day-job would be significant, as he would not be able to write nearly as many books. > >So less copyright "protection" would not mean less creation. And this >"support the artist" idea is bogus: 1/ most artists don't benefit from >it Most artists don't benefit from it because most of them just aren't much good, so few people read their books. >2/ don't deprive millions for the sake a a few tens Don't deprive millions of readers for the sake of a relatively small number who want to read poor quality older books. The really good stuff, the stuff people are actually interested in reading, tends to stay in print. >3/ creation would >go on any way. You would get less of the good stuff, the stuff people actually consider worth paying for, if the small minority of really good writers needed a day-job and couldn't write full-time. -- Great Internet Mersenne Prime Search http://www.mersenne.org/prime.htm Brett Paul Dunbar To email me, use reply-to address From traverso at dm.unipi.it Sun Jan 9 02:59:08 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Sun Jan 9 02:56:23 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology In-Reply-To: (message from Brett Paul Dunbar on Sun, 9 Jan 2005 10:35:00 +0000) References: <110.40a721d2.2f0f7075@aol.com> <20050108214753.GA7234@clipper.ens.fr> Message-ID: <200501091059.j09Ax8N12669@posso.dm.unipi.it> >>>>> "Brett" == Brett Paul Dunbar writes: Brett> Sebastien Blondeel writes >> 2/ very few people live with their music >> >> How many French people make their living with the income they >> get from the mere sales of their records? books? (and derived >> products) out of 60 million people? I would say just a few >> hundreds, maybe thousands. 
A little more if you kick in people
>> living with concerts but then this looks more like a "real
>> job". Most books are written by people who have a "normal" job
>> on the side. Most music bands people have a job on the side.

Brett> Whilst it may be true that most books are written by
Brett> part-time writers, it is also true that hardly anyone reads
Brett> most books. I expect that most books read are written by
Brett> full-time writers, or writers who could be full-time if
Brett> they chose to be. e.g. books by Terry Pratchett make up
Brett> about 1% of UK fiction sales, the impact on the supply of
Brett> good fiction if he still had to have a day-job would be
Brett> significant, as he would not be able to write nearly as
Brett> many books.

>> So less copyright "protection" would not mean less
>> creation. And this "support the artist" idea is bogus: 1/ most
>> artists don't benefit from it

Brett> Most artists don't benefit from it because most of them
Brett> just aren't much good, so few people read their books.

>> 2/ don't deprive millions for the sake a a few tens

Brett> Don't deprive millions of readers for the sake of a
Brett> relatively small number who want to read poor quality older
Brett> books. The really good stuff, the stuff people are actually
Brett> interested in reading, tends to stay in print.

Those artists will probably make more than enough money in the first few years (ten?) and should be encouraged to publish more books simply by not giving them any more money after the initial period. If copyright terms lasted just a short period, we would probably have more novels by J.D. Salinger. In this case, long-term copyright has done just the opposite of encouraging more literary production.
Carlo Traverso

From miranda_vandeheijning at blueyonder.co.uk Sun Jan 9 04:53:14 2005
From: miranda_vandeheijning at blueyonder.co.uk (Miranda van de Heijning)
Date: Sun Jan 9 04:54:19 2005
Subject: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Released
Message-ID: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>

Perhaps I don't understand Moore's Law properly, but aren't we actually well behind on its schedule? If we published our 10,000th book at the end of 2003, doubling in 1.5 years means 20,000 books in mid-2005. At the current rate we are not likely to reach that figure--anywhere between 16k and 17k seems more realistic.

It's not something to be ashamed of, as we are still extending the archive at a very respectable rate, so can we not just be honest about it and admit that while we predicted 20,000 based on Moore's Law, we are no longer growing that quickly? I've always understood Moore's Law to be just a prediction, not a target, so we haven't failed anything. :-)

Other than that, congratulations on reaching 15,000! What was posted as eBook 15000?

To: "Project Gutenberg Weekly Newsletter"
Sent: Saturday, January 08, 2005 8:00 PM
Subject: [gweekly] 15,000th Project Gutenberg eBook Released

>
> Congratulations to the Project Gutenberg Volunteers!!!
>
>
> In the last hour Project Gutenberg released their 15,000th eBook.
>
> More details will be available in Wednesday's email Newsletters.
>
> This far exceeds Moore's Law projections from 10 eBooks in 1990,
> which would predict 15,000 around August, 2006, and which every
> pundit has continually said was an impossible growth rate:
>
>          Projected Growth Rate
>
>  Total   Date        Doubled   Years
>
>     10   Dec, 1990    0         0
>     20   Jun, 1992    1         1.5
>     40   Dec, 1993    2         3
>     80   Jun, 1995    3         4.5
>    160   Dec, 1996    4         6
>    320   Jun, 1998    5         7.5
>    640   Dec, 1999    6         9
>   1280   Jun, 2001    7        10.5
>   2560   Dec, 2002    8        12
>   5120   Jun, 2004    9        13.5
>  10240   Dec, 2005   10        15
>  15000   Aug, 2006   10.5      15+   <<< Predicted Date for ~15,000
>  20480   Jun, 2007   11        16.5
>
>
> Our many thanks to all the thousands of Gutenberg volunteers!!!
>
>
> Michael S. Hart
>
> _______________________________________________
> gweekly mailing list
> gweekly@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gweekly

From holden.mcgroin at dsl.pipex.com Sun Jan 9 05:19:33 2005
From: holden.mcgroin at dsl.pipex.com (Holden McGroin)
Date: Sun Jan 9 05:19:39 2005
Subject: [gutvol-d] Is this a Project Gutenberg violation?
In-Reply-To: <24.67fa9e97.2f120aeb@aol.com>
References: <24.67fa9e97.2f120aeb@aol.com>
Message-ID: <41E12F65.9040904@dsl.pipex.com>

JBuck814366460@aol.com wrote:

> I usually prefer either .txt or .html myself, LOL. PDF is also good
> (the new version coming out soon will offer Linux support, so I have
> heard. I run Red Hat Linux 9 on my machine in a dual-boot environment
> (alongside Wincrap XP) and will shortly upgrade RH9 to Fedora Core 3
> once I finish downloading and burning the ISOs to CD-R next week. I
> LOVE my new DSL connection, it's very handy ;)

Hi!

I believe Acrobat Reader is already available for Linux - It's definitely available on the Suse Linux Professional DVD.
If you'd prefer a more "free" programme to read PDF files, there's GPDF which comes as part of Fedora Core 3: http://www.inf.tu-dresden.de/~mk793652/gpdf/ You'll also find packages for many different flavours of Linux available here: http://rpm.pbone.net/ Enjoy :-) Holden From holden.mcgroin at dsl.pipex.com Sun Jan 9 05:45:44 2005 From: holden.mcgroin at dsl.pipex.com (Holden McGroin) Date: Sun Jan 9 05:45:52 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology In-Reply-To: References: <110.40a721d2.2f0f7075@aol.com> <20050108214753.GA7234@clipper.ens.fr> Message-ID: <41E13588.6000200@dsl.pipex.com> Brett Paul Dunbar wrote: > Sebastien Blondeel writes > > Whilst it may be true that most books are written by part-time writers, > it is also true that hardly anyone reads most books. If you look at it on sales per copy, that's fine. However, please take a look at this article from Wired called "The Long Tail" which has many statistics on how the sales of all those hundreds of thousands of "unpopular" books actually match up with sales of the most popular books. There may only be a few readers of each but there's significant profit made (by some people, at least) on all those unpopular books: http://www.wired.com/wired/archive/12.10/tail.html > I expect that most > books read are written by full-time writers, or writers who could be > full-time if they chose to be. e.g. books by Terry Pratchett make up > about 1% of UK fiction sales, the impact on the supply of good fiction > if he still had to have a day-job would be significant, as he would not > be able to write nearly as many books. Here you make the most erroneous assumption: that less restrictive copyright laws would force Terry Pratchett to get a day job. Assuming Copyright was as it was in the U.S. Constitution (14 years with an optional extension for a total of 28), Terry Pratchett would still be earning money from his last 28 years of writing. 
Take this in the context that he started writing "full time" in 1987. >> So less copyright "protection" would not mean less creation. And this >> "support the artist" idea is bogus: 1/ most artists don't benefit from >> it > > Most artists don't benefit from it because most of them just aren't much > good, so few people read their books. Wait, wait, wait. The artistic merits of a book have very little to do with how many copies they sell. > Don't deprive millions of readers for the sake of a relatively small > number who want to read poor quality older books. The really good stuff, > the stuff people are actually interested in reading, tends to stay in > print. I see, now it's "older" books that're of poor quality. Why is this? Is literature undergoing some miraculous transformation which causes all older books to be of poorer quality? With less restrictive copyright laws in place (I suggest 14 years + 14 years), all authors capable of selling enough copies of a book for them to make a living out of, would still be able to do it. Copyright was not created (in the U.S.) in order to ensure a never-ending payday for authors (and, more importantly, publishers!) -- it was created to encourage more creation than would otherwise happen. Going back to Anne's analogy, if you're a plumber and you do one year of amazingly good work, you're paid for one year. Authors, on the other hand, seem to think that for one year of work, they should be paid until seventy years after the end of their lives. I know I'm not the only person who think that's unreasonable. 
Cheers,
Holden

From holden.mcgroin at dsl.pipex.com Sun Jan 9 05:53:58 2005
From: holden.mcgroin at dsl.pipex.com (Holden McGroin)
Date: Sun Jan 9 05:54:05 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
Message-ID: <41E13776.6030304@dsl.pipex.com>

Miranda van de Heijning wrote:

> Perhaps I don't understand Moore's Law properly, but aren't we actually well
> behind on its schedule? If we published our 10,000th book at the end of
> 2003, doubling in 1.5 years means 20,000 books in mid-2005. At the current
> rate we are not likely to reach that figure--anywhere between 16k and 17k
> seems more realistic.
>
> It's not something to be ashamed of, as we are still extending the archive
> at a very respectable rate, so can we not just be honest about it and admit
> that we while we predicted 20,000 based on Moore's Law, we are no longer
> growing that quickly. I've always understood Moore's Law to be just a
> prediction, not a target, so we haven't failed anything. :-)
>
> Other than that, congratulations on reaching 15,000! What was posted as
> eBook 15000?

Recently, we haven't matched Moore's law's growth rates, not that there's ANYTHING AT ALL wrong with that - Project Gutenberg is still growing at its fastest rate ever. However, if you look over the entire history of PG, you'll see that on average, we're quite ahead of Moore's law.

If our production grows at less than the Moore's law rate for much longer, then our average growth rate may slip below that predicted by Moore. It's a nice statistic while it lasts, but I think everybody at PG knows what it's all about: producing e-books not just quickly but WELL.

At some point in the future, PG may produce fewer books than, say, the Million Book Project.
However, if you compare PG's double/triple proofread texts against the Million Book Project's unproofed OCR from quickly-done scans, I think people will see where the true value lies.

Cheers,
Holden

From hart at pglaf.org Sun Jan 9 08:45:25 2005
From: hart at pglaf.org (Michael Hart)
Date: Sun Jan 9 08:45:27 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
Message-ID:

On Sun, 9 Jan 2005, Miranda van de Heijning wrote:

>
> Perhaps I don't understand Moore's Law properly, but aren't we actually well
> behind on its schedule? If we published our 10,000th book at the end of
> 2003, doubling in 1.5 years means 20,000 books in mid-2005. At the current
> rate we are not likely to reach that figure--anywhere between 16k and 17k
> seems more realistic.

When you've been watching a growth curve for a long time, the changes don't seem as drastic as when you watch for a short time.

Any growth rate prediction has to have a starting point; we have always used 1990, though we could restart with other years, and, yes, given that there are ups and downs in any real growth curve, you could always pick either the highs or lows [as they do in government statistics] and then skew the results massively as they change exponentially.

Since we grew so rapidly in some periods, sometimes in excess of TWICE the Moore's Law predictions, you could always start with those years as a baseline to demonstrate that we are no longer growing at TWICE the Moore's Law rate.

However, 1990 is the first year we actually made totally consistent additions every month to the collection, so it's the best start point.

When you map things out over very long periods, all the bumps in the road seem to flatten out. .
.I can send you a graph from 1990 to 2005 if you like, so you can see that unless you use a much larger graph than I can include in this format, it looks very smooth, even though it switches from a resolution of years to months at the top. mh From gbuchana at rogers.com Sun Jan 9 10:02:28 2005 From: gbuchana at rogers.com (Gardner Buchanan) Date: Sun Jan 9 10:02:39 2005 Subject: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Re In-Reply-To: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> Message-ID: Hi Miranda, On 12:53:14 Miranda van de Heijning wrote: > > > Other than that, congratulations on reaching 15,000! What was posted as > eBook 15000? > As near as I can make out, we are not there yet. 14639 is the highest number I can see and it was posted just now: 9-jan-2005. ============================================================ Gardner Buchanan Ottawa, ON FreeBSD: Where you want to go. Today. From jmdyck at ibiblio.org Sun Jan 9 12:15:52 2005 From: jmdyck at ibiblio.org (Michael Dyck) Date: Sun Jan 9 12:16:24 2005 Subject: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Re References: Message-ID: <41E190F8.55017770@ibiblio.org> Gardner Buchanan wrote: > > On 12:53:14 Miranda van de Heijning wrote: > > > > Other than that, congratulations on reaching 15,000! > > What was posted as eBook 15000? > > As near as I can make out, we are not there yet. 14639 is the > highest number I can see and it was posted just now: 9-jan-2005. I suspect Michael Hart has included the PG of Australia collection (400 on Wednesday) in his grand total. 
-Michael From marcello at perathoner.de Sun Jan 9 10:15:55 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Jan 9 13:17:26 2005 Subject: [gutvol-d] Project Gutenberg Original Directory Structure In-Reply-To: <41E0900A.8090100@adelaide.edu.au> References: <20050106144053.710304F4C5@ws6-5.us4.outblaze.com> <41DD5CED.8060605@srv.net> <41DD6791.7020200@srv.net> <41DDD332.3070408@adelaide.edu.au> <41DDEBCD.6050103@adelaide.edu.au> <41DF22FB.8070806@perathoner.de> <41E0900A.8090100@adelaide.edu.au> Message-ID: <41E174DB.1000009@perathoner.de> Steve Thomas wrote: > That's great, exactly what I wanted -- and I see that napol10.zip also > takes you to the catalog page. So I assume you are redirecting > napol10.* Would it be possible to also redirect napol* -- so that you'd > get the catalog page regardless of version? That would mean that anyone > with an old link would then discover thru the catalog that there was a > later version. I can redirect any file that was in the old directories, provided I can get a mapping filename => etext-no. Some of these mappings have been deleted after the files were moved because I didn't think of this gimmick. The mappings of the files recently REPosted are still in the database. I'll try to get the old ones back from a backup, if there is one. 
--
Marcello Perathoner
webmaster@gutenberg.org

From marcello at perathoner.de Sun Jan 9 13:11:42 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Jan 9 13:17:33 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
Message-ID: <41E19E0E.6000307@perathoner.de>

Michael Hart wrote:

>>This far exceeds Moore's Law projections from 10 eBooks in 1990,
>>which would predict 15,000 around August, 2006, and which every
>>pundit has continually said was an impossible growth rate:
>>
>>         Projected Growth Rate
>>
>>Total   Date        Doubled   Years
>>
>>   10   Dec, 1990    0         0
>>   20   Jun, 1992    1         1.5
>>   40   Dec, 1993    2         3
>>   80   Jun, 1995    3         4.5
>>  160   Dec, 1996    4         6
>>  320   Jun, 1998    5         7.5
>>  640   Dec, 1999    6         9
>> 1280   Jun, 2001    7        10.5
>> 2560   Dec, 2002    8        12
>> 5120   Jun, 2004    9        13.5
>>10240   Dec, 2005   10        15
>>15000   Aug, 2006   10.5      15+   <<< Predicted Date for ~15,000
>>20480   Jun, 2007   11        16.5

Bzzzzt, wrong. But thank you for playing!

You tried to show that the number of books in the collection obeys Moore's Law. Moore's Law fits the data to a 2 ^ (t / 1.5) exponential curve, that is, one with a doubling time of 1.5 years.

In that case we have: you started in 1971 and we reached 10,000 books by the end of 2003. That's roughly 33 years for 10,000 books.

With 33 years and 10,000 books we get:

  x * 2 ^ (33 / 1.5) = 10000

and we solve:

  x = 0.002384

A year after book 10,000 we should have gotten to:

  0.002384 * 2 ^ (34 / 1.5) = 15873

which we have failed to do.

We should get to #20,000 a year and a half after #10,000. That would be May 2005.

So much for Moore's Law, which, by the way, doesn't work well in computer science either, but is for some strange reason one of the most-cited "Laws".

I'll attach a plot of the function:

  0.002384 * 2 ^ ((x - 1971) / 1.5)

starting at x = 2000 and ending at x = 2008. That is, if the attachment comes through.
Otherwise use Gnuplot to plot it yourself.

--
Marcello Perathoner
webmaster@gutenberg.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot.png
Type: image/png
Size: 5068 bytes
Desc: not available
Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050109/811931ad/plot.png

From marcello at perathoner.de Sun Jan 9 13:31:23 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Sun Jan 9 13:38:07 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To:
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
Message-ID: <41E1A2AB.4070108@perathoner.de>

Michael Hart wrote:

> However, 1990 is the first year we actually made totally consistent
> additions every month to the collection, so it's the best start point.

So you pick an arbitrary starting point to make the math come out right. Why not choose 1971 as the starting point and accept where the math gets you: we are *behind* Moore's Law.

--
Marcello Perathoner
webmaster@gutenberg.org

From jmdyck at ibiblio.org Sun Jan 9 13:26:44 2005
From: jmdyck at ibiblio.org (Michael Dyck)
Date: Sun Jan 9 13:47:35 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA>
Message-ID: <41E1A194.3A3DFEF9@ibiblio.org>

Miranda van de Heijning wrote:
>
> Perhaps I don't understand Moore's Law properly, but aren't we
> actually well behind on its schedule?

It depends on what you use as your reference point. E.g.:

  10000 in Oct 2003 predicts ~17,000 now, and we're slightly behind;
   1000 in Oct 1997 predicts ~30,000 now, and we're way behind;
    100 in Dec 1993 predicts ~17,000 now, and we're slightly behind;
     10 in Dec 1990 predicts  ~7,000 now, and we're way ahead.
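The reference-point arithmetic above can be checked with a short Python sketch. It assumes the 1.5-year doubling period used throughout the thread, takes the first of each month as the reference date (the message gives only month and year), and treats the date of the message as "now". The function and constant names are illustrative only, and the figures it prints are rough estimates that shift by a few percent depending on the day chosen.

```python
from datetime import date

DOUBLING_YEARS = 1.5  # doubling period assumed throughout the thread


def years_between(start: date, end: date) -> float:
    """Elapsed time in years, using an average year length of 365.25 days."""
    return (end - start).days / 365.25


def predicted_total(ref_count: float, ref_date: date, target: date,
                    doubling_years: float = DOUBLING_YEARS) -> float:
    """Extrapolate ref_count from ref_date to target on a pure doubling curve."""
    elapsed = years_between(ref_date, target)
    return ref_count * 2 ** (elapsed / doubling_years)


# Reference points quoted in the message. The day of the month is not given
# there, so the first of each month is an assumption.
REFERENCE_POINTS = [
    (10_000, date(2003, 10, 1)),
    (1_000, date(1997, 10, 1)),
    (100, date(1993, 12, 1)),
    (10, date(1990, 12, 1)),
]

NOW = date(2005, 1, 9)  # date of the message
ACTUAL = 15_000         # total announced in the newsletter

if __name__ == "__main__":
    for count, when in REFERENCE_POINTS:
        estimate = predicted_total(count, when, NOW)
        status = "behind" if ACTUAL < estimate else "ahead"
        print(f"{count:>6} in {when:%b %Y} predicts {estimate:>8,.0f}; actual is {status}")
```

Changing `DOUBLING_YEARS` or the reference dates shows how sensitive the "ahead or behind" verdict is to the chosen baseline, which is the point being argued in the thread.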
The earliest reference I can find to Moore's Law in relation to PG's growth rate is in the Nov 27, 2002 weekly newsletter: http://www.gutenberg.net/newsletter/archive/PGWeekly_2002_11_27.txt in which Michael Hart uses '100 in Dec 1993' as the reference point. PG's total stayed remarkably close to that model (maybe only 1 or 2% above it) for all of 2002 and 2003, but started falling away from it in early 2004.

In the last 6 months, the PG total has increased by about 14%, from 13,155 on 2004-07-07 to 15,000 on 2005-01-08, which puts it close to a 'doubling every 2.6 years' curve.

-Michael

From tim at tmeekins.com Sun Jan 9 14:27:06 2005
From: tim at tmeekins.com (Tim Meekins)
Date: Sun Jan 9 14:27:22 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1A194.3A3DFEF9@ibiblio.org>
Message-ID: <027201c4f69a$5ac605a0$3201a8c0@pink>

From the last newsletter, I think this is the most telling stat:

   338  Average Per Month in 2004
   355  Average Per Month in 2003
   203  Average Per Month in 2002
   103  Average Per Month in 2001

  4049  New eBooks in 2004
  4164  New eBooks in 2003
  2441  New eBooks in 2002
  1240  New eBooks in 2001

We've done FEWER books in 2004 than in 2003... At that rate, I don't see how we could be keeping up with Moore's Law. We did pretty well from 2001 to 2003, but we've started to plateau, if not slide back a bit. I'm sure we will see much more growth, but at a steady Moore's Law curve, I'm not so sure.

From j.hagerson at comcast.net Sun Jan 9 14:42:48 2005
From: j.hagerson at comcast.net (John Hagerson)
Date: Sun Jan 9 14:43:01 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <027201c4f69a$5ac605a0$3201a8c0@pink>
Message-ID: <000101c4f69c$8ed21440$6401a8c0@sarek>

I assert that the 2004 numbers are lower for a number of reasons (not necessarily in order of importance):

1.
We artificially divide one printed work into multiple eBooks much less often than we used to. 2. We are much less likely to post an HTML (or .lit, or .doc) copy of a work under a separate eBook number. 3. We are taking more time to make sure that every book we post is of very high quality. All of these factors make it harder to keep up with Moore's "Law". I think that we should celebrate what we have done and what we are doing and not fret that we aren't "doubling our output every eighteen months." John Hagerson -----Original Message----- From: gutvol-d-bounces@lists.pglaf.org [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of Tim Meekins Sent: Sunday, January 09, 2005 4:27 PM To: Project Gutenberg Volunteer Discussion Subject: Re: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Released >From the last newsletter, I think this is the most telling stat: 338 Average Per Month in 2004 355 Average Per Month in 2003 203 Average Per Month in 2002 103 Average Per Month in 2001 4049 New eBooks in 2004 4164 New eBooks in 2003 2441 New eBooks in 2002 1240 New eBooks in 2001 We've done FEWER books in 2004 than in 2003... At that rate, I don't see how we could be keeping up with Moore's Law. We did pretty good from 2001 to 2003, but we've started to plateau, if not slide back a bit. I'm sure we will see much more growth, but at a steady Moore's Law curve, I'm not so sure. _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d From sharris at steveharris.net Sun Jan 9 14:55:28 2005 From: sharris at steveharris.net (steve harris) Date: Sun Jan 9 14:53:20 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: <027201c4f69a$5ac605a0$3201a8c0@pink> Message-ID: Folks - I agree with Tim M that the fact that our output is stagnant is very telling; its much more important than whether we are complying with Moore's Law, Megan's Law or Murphy's Law. 
While one important issue is the post-proofing bottleneck in DP (which is being given attention), as important but more fundamental is whether the PG project/organization/effort is positioned for growth. In my view, chugging along at 5-10K per year is very nice, but will be increasingly marginalized by other efforts (whether Google-wise or otherwise). It also raises a difficult issue: If we are going to do a significant chunk of the public domain in a reasonably short period, it probably doesn't matter in what order we do the books. If we are only going to get to 50K over the next 5 years or to 100K over the next 10 years, we should probably give some thought to where we should put our efforts (e.g. what is the relative value of "The Yale Shakespeare" on top of the several versions of each play we already have?) Steve Harris pg@steveharris.net > -----Original Message----- > From: gutvol-d-bounces@lists.pglaf.org > [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of Tim Meekins > Sent: Sunday, January 09, 2005 2:27 PM > To: Project Gutenberg Volunteer Discussion > Subject: Re: [gutvol-d] Fw: [gweekly] 15,000th Project > Gutenberg eBook Released > > > >From the last newsletter, I think this is the most telling stat: > > 338 Average Per Month in 2004 > 355 Average Per Month in 2003 > 203 Average Per Month in 2002 > 103 Average Per Month in 2001 > > 4049 New eBooks in 2004 > 4164 New eBooks in 2003 > 2441 New eBooks in 2002 > 1240 New eBooks in 2001 > > We've done FEWER books in 2004 than in 2003... At that rate, > I don't see how > we could be keeping up with Moore's Law. We did pretty good > from 2001 to > 2003, but we've started to plateau, if not slide back a bit. > I'm sure we > will see much more growth, but at a steady Moore's Law curve, > I'm not so > sure. 
> > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d > From j.hagerson at comcast.net Sun Jan 9 15:32:57 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Sun Jan 9 15:33:11 2005 Subject: FW: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released Message-ID: <000301c4f6a3$90236f40$6401a8c0@sarek> Steve Harris said: It also raises a difficult issue: If we are going to do a significant chunk of the public domain in a reasonably short period, it probably doesn't matter in what order we do the books. If we are only going to get to 50K over the next 5 years or to 100K over the next 10 years, we should probably give some thought to where we should put our efforts (e.g. what is the relative value of "The Yale Shakespeare" on top of the several versions of each play we already have?) We are all volunteers. You can lead volunteers by inspiring them; an effort to "herd" volunteers by dictating what they can and cannot do can easily cause them to scatter (or disappear). If you would care to share your grand vision, please do -- I could use a dose of inspiration today. Cloudy in Chicago, John Hagerson From stephen.thomas at adelaide.edu.au Sun Jan 9 15:43:02 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Sun Jan 9 15:43:16 2005 Subject: FW: [gutvol-d] Project Gutenberg Original Directory Structure In-Reply-To: <000801c4f5f7$f8b9cd30$6401a8c0@sarek> References: <000801c4f5f7$f8b9cd30$6401a8c0@sarek> Message-ID: <41E1C186.9080609@adelaide.edu.au> That wouldn't matter -- you'd be redirecting etextXX/xxxxx* so having the same name used in different directories won't matter. I vaguely recall tripping over one instance where a file had a four character name instead of the usual five -- was it "moby" -- and this collided with a five-character name in the same year. But I think that was only a single instance. 
Steve John Hagerson wrote: > I believe that there would be an issue redirecting some links of the type > napol* because we have the same book name in two different ebook* > directories for two different eBooks. > > Steve Thomas wrote: > > That's great, exactly what I wanted -- and I see that napol10.zip also > takes you to the catalog page. So I assume you are redirecting > napol10.* Would it be possible to also redirect napol* -- so that you'd > get the catalog page regardless of version? That would mean that anyone > with an old link would then discover thru the catalog that there was a > later version. > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d -- Stephen Thomas, Senior Systems Analyst, University of Adelaide Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA Phone: +61 8 830 35190 Fax: +61 8 830 34369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ CRICOS Provider Number 00123M ----------------------------------------------------------- This email message is intended only for the addressee(s) and contains information that may be confidential and/or copyright. If you are not the intended recipient please notify the sender by reply email and immediately delete this email. Use, disclosure or reproduction of this email by anyone other than the intended recipient(s) is strictly prohibited. No representation is made that this email or any attachments are free of viruses. Virus scanning is recommended and is the responsibility of the recipient. 
From vze3rknp at verizon.net Sun Jan 9 15:50:39 2005 From: vze3rknp at verizon.net (Juliet Sutherland) Date: Sun Jan 9 15:50:33 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: <027201c4f69a$5ac605a0$3201a8c0@pink> References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1A194.3A3DFEF9@ibiblio.org> <027201c4f69a$5ac605a0$3201a8c0@pink> Message-ID: <41E1C34F.4060600@verizon.net> 2003 saw the posting of a lot of audio books and also several versions of the Bible in multiple sections. 2004 did not have any of those. In terms of actual new ebooks, 2004 is well ahead of 2003. JulietS Tim Meekins wrote: >> From the last newsletter, I think this is the most telling stat: > > > 338 Average Per Month in 2004 > 355 Average Per Month in 2003 > 203 Average Per Month in 2002 > 103 Average Per Month in 2001 > > 4049 New eBooks in 2004 > 4164 New eBooks in 2003 > 2441 New eBooks in 2002 > 1240 New eBooks in 2001 > > We've done FEWER books in 2004 than in 2003... At that rate, I don't > see how we could be keeping up with Moore's Law. We did pretty good > from 2001 to 2003, but we've started to plateau, if not slide back a > bit. I'm sure we will see much more growth, but at a steady Moore's > Law curve, I'm not so sure. 
> > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > From stephen.thomas at adelaide.edu.au Sun Jan 9 16:45:39 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Sun Jan 9 16:45:56 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> Message-ID: <41E1D033.2060002@adelaide.edu.au> Moore's Law actually has nothing to do with ebooks: "Moore's law is an empirical observation stating, in effect, that at our rate of technological development and advances in the semiconductor industry, the complexity of integrated circuits doubles every 18 months." (From Wikipedia.) Michael Hart likes to compare PG growth with Moore's Law, but nobody should place too much importance on this. The important thing is that PG continues to grow, and that it continues to develop. Steve Miranda van de Heijning wrote: > Perhaps I don't understand Moore's Law properly, but aren't we actually well > behind on its schedule? If we published our 10,000th book at the end of > 2003, doubling in 1.5 years means 20,000 books in mid-2005. At the current > rate we are not likely to reach that figure--anywhere between 16k and 17k > seems more realistic. -- Stephen Thomas, Senior Systems Analyst, University of Adelaide Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA Phone: +61 8 830 35190 Fax: +61 8 830 34369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ CRICOS Provider Number 00123M ----------------------------------------------------------- This email message is intended only for the addressee(s) and contains information that may be confidential and/or copyright. If you are not the intended recipient please notify the sender by reply email and immediately delete this email. 
Use, disclosure or reproduction of this email by anyone other than the intended recipient(s) is strictly prohibited. No representation is made that this email or any attachments are free of viruses. Virus scanning is recommended and is the responsibility of the recipient.

From blondeel at clipper.ens.fr Mon Jan 10 02:12:13 2005
From: blondeel at clipper.ens.fr (Sebastien Blondeel)
Date: Mon Jan 10 02:12:35 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <41E1C34F.4060600@verizon.net>
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1A194.3A3DFEF9@ibiblio.org> <027201c4f69a$5ac605a0$3201a8c0@pink> <41E1C34F.4060600@verizon.net>
Message-ID: <20050110101213.GB20581@clipper.ens.fr>

On Sun, Jan 09, 2005 at 06:50:39PM -0500, Juliet Sutherland wrote:
> 2003 saw the posting of a lot of audio books and also several versions
> of the Bible in multiple sections. 2004 did not have any of those. In
> terms of actual new ebooks, 2004 is well ahead of 2003.

Alternative Stat: "Real e-Books"
--------------------------------

Does the figure of "actual new ebooks" exist? If not, why not consider creating it? Wikipedia too has both an "official" and a "real" (> 200 ch) article count:
http://en.wikipedia.org/wikistats/EN/TablesWikipediaEN.htm

Request For New Stat: Human Resources and Work
----------------------------------------------

Moore's Law has little chance to work here: there is no real technological progress going on (PGDP, gutcheck etc. help, but they are far from the technological progress that has been going on in computers since the 1950's). It all boils down to human resources. Those only expand exponentially in pyramid schemes. Granted, we still have some room for that...
This is the reason why I believe an interesting figure would be the number of volunteers (PGDP active people would be a nice stat) and the amount of work they do (PGDP proofed pages would be a nice stat, not impeded by the PP bottleneck mentioned previously in this thread: if in 2005 PGDP volunteers proof 100 million pages but PGDP PP is still dripping out slowly, the official PG stats won't show what is really going on).

Suggestion For Improvements: Work on PG and PGDPs' Home Pages
-------------------------------------------------------------

PG decided to go public at the 10,000th e-book. I would like it to be more successful, famous, and have more volunteers. The ebooksgratuits.com webmaster has a very active group of people doing e-books in Word (I'm working on a filter to help them transform that into PG-acceptable formats, such as TXT or XHTML). He thinks a BIG reason why PG and PGDPs are not successful is the fact that the websites are not clear, not sexy, etc. You can have a look at his site or ask him for details to know what he means. I mention that here because I am not sure this issue is known amongst the volunteers (I'm new enough around here, maybe this has already been addressed).

From blondeel at clipper.ens.fr Mon Jan 10 04:12:11 2005
From: blondeel at clipper.ens.fr (Sebastien Blondeel)
Date: Mon Jan 10 04:12:36 2005
Subject: PG editorial policy? [Re: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released]
In-Reply-To:
References: <027201c4f69a$5ac605a0$3201a8c0@pink>
Message-ID: <20050110121211.GA25054@clipper.ens.fr>

On Sun, Jan 09, 2005 at 02:55:28PM -0800, steve harris wrote:
> While one important issue is the post-proofing bottleneck in DP (which
> is being given attention), as important but more fundamental is whether

Can you give details about this bottleneck issue?

> the PG project/organization/effort is positioned for growth.
> It also raises a difficult issue: If we are going to do a significant > chunk of the public domain in a reasonably short period, it probably > doesn't matter in what order we do the books. You are touching here the problem of the (lack of) editorial policy of PG / PGDP. I tried to "centralize" the list of French books being worked on (or finished) on PG, both PGDP, and ebooksgratuits.com: http://www.eleves.ens.fr/home/blondeel/PGDP/catalog/ A friend of mine studying literature "promised" to give me "the list of all French books ever, sorted by descending importance" when she could ask her professor --- or anything "close" to that, because of course this list is impossible to make (even without an "importante" rating). For Halloween, some French-language PGDP PMs just grep'ed the string "fantome" (French for "ghost") in Gallica[*] (a big public website with many scans of books, and those guys agreed for PGDP to use their images). [*] http://gallica.bnf.fr/ The ebooksgratuits.com people sometimes work on all the books of a given author. All this makes for a not very coherent, consistent editorial policy. I guess literature people can easily criticize the PG French catalog (some very obscure books, and some blatant misses). Of course the obvious answer would be "stop whining and do it yourself then!" but those people just don't work this way (think "psychology"). They're not hackers, they don't have this culture of "let's get involved, roll up our sleeves and change the world", but still they could be useful to PG. When I proof pages in PGDP, I usually work on the oldest book sitting around. It's both a feeling of "duty" and it makes me discover things I wouldn't have without that. So the time I spend on obscure books is not spent on more "important" ones. 
On the other hand, it would be difficult to set up an official editorial board: of course it should not be too bureaucratic and complicated, of course it should not have a monopoly of the books proposed to PGDP (PMs would still be free to kick in books they just like, keeping in mind they will delay the more "important" books. We work in limited resources, so we should define priorities). But above all we are missing the competent people: I guess a bunch of University professors specialized in pre-XXth century literature, history, philosophy etc. would do, but how many of those know PG? (If you don't like scholars because they tend to be non pragmatic and argue about pointless details, replace that with: essay writers, journalists, whoever is important in the "culture" of the language considered). Has PG achieved any kind of fruitful collaboration with scholars? I could use my ink-jet printer and phone diary and send a mailing to random French literature professors I would find, but that would not look very credible to them (no matter how important and nice PG is anyway). References would help to make a bootstrap, then these people could relay the information between themselves and to their students (all master thesis dealing with old books could give PG their stuff, etc.; students could work on some books in the PGDP's, etc.). Plus having some "official title" (like "PG French editorial board") in something that looks important and more and more important with time ("PG") could maybe give them the incentive to help us a little. From maitriv at yahoo.com Mon Jan 10 07:41:14 2005 From: maitriv at yahoo.com (maitri venkat-ramani) Date: Mon Jan 10 07:41:20 2005 Subject: [gutvol-d] Murphy's Law In-Reply-To: <20050107022256.GB2396@clipper.ens.fr> Message-ID: <20050110154114.38062.qmail@web52307.mail.yahoo.com> I don't see the usefulness of comparing our growth to Murphy's Law as even recent advances in computing is rendering this Law obsolete. 
Take my world of geological visualization, for instance. As explained by one of the specialists at Landmark Graphics, "With Linux 64-bit, you can have an unlimited volume of RAM on the system. The historical limit for the 32-bit system was around 2GB. Now, we're seeing systems that have 16GB of memory. No longer does compute power double every 18 months at the same price ... when things scale by 10 times, it's no longer just faster - you're in a different world." Granted, my system is not going to run at full spec speed at any given time owing to bandwidth quirks, etc., but it sure outdoes Murphy's Law nowadays. If we need something to which to compare our growth rates, we should define our own new baseline curve and see how our growth proceeds from there. Since I personally don't know how to create one of these benchmarks, I will stop here and let someone else recommend. Maitri __________________________________ Do you Yahoo!? All your favorites on one personal page – Try My Yahoo! http://my.yahoo.com From Gutenberg9443 at aol.com Mon Jan 10 08:39:39 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Mon Jan 10 08:39:49 2005 Subject: [gutvol-d] Fwd: Check out City Journal Autumn 2004 | The Classics in the Slums by Jonath... Message-ID: Skipped content of type multipart/alternative-------------- next part -------------- An embedded message was scrubbed... From: Utahpindar@aol.com Subject: Check out City Journal Autumn 2004 | The Classics in the Slums by Jonathan Ro Date: Mon, 10 Jan 2005 10:47:07 EST Size: 1597 Url: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050110/e92927c3/attachment-0001.mht From ag737 at freenet.carleton.ca Mon Jan 10 08:44:57 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Mon Jan 10 08:45:06 2005 Subject: [gutvol-d] National Web library do-able, affordable, visionary Message-ID: <169faf16c41d.16c41d169faf@ncf.ca> Full text posted here with Prof. 
Geist's kind permission: Toronto Star 2005.01.10 D03 Michael Geist Law Bytes --------------------------------------------------- National Web library do-able, affordable, visionary --------------------------------------------------- In the mid-1990s, Ottawa established a bold new vision for the Internet in Canada. The centrepiece was a commitment to establish national Internet access from coast to coast to coast, supported by a program that would enable the country to quickly become the first in the world to connect every single school, no matter how small or large, to the Internet. Not only did Canada meet its goal, but it completed the program ahead of schedule. As we enter the middle of this decade, the time has come for Industry Minister David Emerson and his colleagues to articulate a new future- oriented vision for the Canadian Internet. While the last decade centred on access to the Internet, the dominant issue this decade is focused on access to the content on the Internet. To address that issue, the federal government should again think big. One opportunity is to greatly expand the National Library of Canada's digital efforts by becoming the first country in the world to create a comprehensive national digital library. The library, which would be fully accessible online, would contain a digitally scanned copy of every book, government report, and legal decision ever published in Canada. A national digital library would provide unparalleled access to Canadian content in English and French along with aboriginal and heritage languages such as Yiddish and Ukrainian. The library would serve as a focal point for the Internet in Canada, providing an invaluable resource to the education system and ensuring that access to knowledge is available to everyone, regardless of economic status or geographic location. 
>From a cultural perspective, the library would establish an exceptional vehicle for promoting Canadian creativity to the world, leading to greater awareness of Canadian literature, science, and history. By extending the library to government documents and court decisions, it would help meet the broader societal goal of providing all Canadians with open access to their laws and government policies. Moreover, since the government holds the copyright associated with its own reports and legal decisions, it is able to grant complete, unrestricted access to all such materials immediately alongside the approximately 100,000 Canadian books that are already part of the public domain. Creating virtual libraries to complement the world's great physical libraries is already underway. Project Gutenberg, an all-volunteer initiative, has succeeded in bringing thousands of public domain texts to the Web. Last summer, the British Library unveiled an ambitious plan to digitize and freely post on the Internet thousands of historical newspapers that are now in the public domain. That plan will bring more than one million pages of history to the Internet, including work from a young Charles Dickens. Last month Google announced that it had reached agreement with several of the world's leading research libraries, including ones at Harvard, Stanford, Michigan, Oxford, and the New York Public Library, to scan more than 15 million books into its search archive. Once the Google project is completed, the general public will enjoy complete, full-text access to thousands of books that are now part of the public domain because the term of copyright associated with those books has expired. For books that remain subject to copyright, Google will still scan a copy of the book, but will only grant the general public more modest access to its content, providing users with smaller excerpts of the work - a policy that is consistent with principles of fair use under copyright law. 
The Google project epitomizes the essence of the copyright balance. The public will benefit from unrestricted access to works in the public domain along with more limited access to other work, all without the need to seek any prior permission. Authors will still enjoy copyright protection in their work and will frequently find that greater access leads to increased commercial success. While digitally scanning more than 10 million Canadian books and documents is a daunting task, the Google project illustrates that it is financially feasible. Reports suggest that it will cost Google approximately $10 to scan each book. Assuming similar costs for a Canadian project and a five-year timeline, the $20 million annual price tag represents a fraction of the total governmental commitment toward Canadian culture and Internet development. In fact, the most significant barriers to a national digital library do not arise from fiscal challenges but rather from two potential copyright reforms currently winding their way through the system. First, the federal government is contemplating reversing the decade-old policy of avoiding Internet licensing by creating a new licensing system for Internet content that would create new restrictions to accessing online content. By proposing a very narrow definition of what can be accessed without compensation, the plan would effectively force millions of Canadian students to pay for access to content that is otherwise publicly available. Despite opposition from the education community, the proposal is marching forward, constituting a significant setback to the goal of encouraging Internet use in Canada. Given the Supreme Court of Canada's recent commitment to copyright balance and robust user rights, it is clear that for most uses no license is needed to provide schools with appropriate access to online content such as a potential national digital library. With this in mind, this proposal should be quickly scrapped. 
Second, the Canadian Heritage Minister Liza Frulla's Copyright Policy Branch recently announced that this year it plans to launch a public consultation on a proposal to extend the term of copyright in Canada from its current 50 years after the death of the author to at least 70 years after death (authors enjoy exclusive copyright in their work from the moment of creation until 50 years after they die). Extending the copyright term would deal a serious blow to a national digital library because it would instantly remove thousands of works from the public domain. Although the U.S. and European Union have extended their copyright terms by an additional 20 years, the vast majority of the world's population lives in countries that have not. Those countries have recognized that an extension is unsupportable from a policy perspective. It will not foster further creative activity, it is not required under international intellectual property law, and it effectively constitutes a massive transfer of wealth from the public to the heirs of a select group of copyright holders. Given the economic and societal dangers associated with a copyright term extension, even moving forward with a consultation constitutes an embarrassing case of putting the interests of a select few ahead of the public interest. A new year is traditionally a time for bold, new resolutions. As Parliamentarians return to Ottawa, they should be encouraged to seize the opportunity to establish a national vision for the Internet that will again propel Canada into a global leadership position. Supported by appropriate copyright policies, a national digital library comprised of every Canadian book ever published would provide an exceptional resource for Canadians at home as well as advantageously promote the export of Canadian culture abroad. Michael Geist is the Canada Research Chair in Internet and E-commerce Law at the University of Ottawa. He is on-line at www.michaelgeist.ca. 
The opinions expressed herein are personal and do not necessarily reflect those of the University of Ottawa. From maitriv at yahoo.com Mon Jan 10 08:50:40 2005 From: maitriv at yahoo.com (maitri venkat-ramani) Date: Mon Jan 10 08:50:46 2005 Subject: [gutvol-d] Moore's Law (was Murphy's Law) Message-ID: <20050110165040.62629.qmail@web52307.mail.yahoo.com> Sorry, between my cold medicine intake and dealing with cantankerous video cards, it seems my Freudian slip was showing. Corrected message below. Laughs, Maitri --- maitri venkat-ramani wrote: I don't see the usefulness of comparing our growth to Moore's Law as even recent advances in computing is rendering this Law obsolete. Take my world of geological visualization, for instance. As explained by one of the specialists at Landmark Graphics, "With Linux 64-bit, you can have an unlimited volume of RAM on the system. The historical limit for the 32-bit system was around 2GB. Now, we're seeing systems that have 16GB of memory. No longer does compute power double every 18 months at the same price ... when things scale by 10 times, it's no longer just faster - you're in a different world." Granted, my system is not going to run at full spec speed at any given time owing to bandwidth quirks, etc., but it sure outdoes Moore's Law nowadays. If we need something to which to compare our growth rates, we should define our own new baseline curve and see how our growth proceeds from there. Since I personally don't know how to create one of these benchmarks, I will stop here and let someone else recommend. Maitri __________________________________ Do you Yahoo!? Yahoo! Mail - Easier than ever with enhanced search. Learn more. 
http://info.mail.yahoo.com/mail_250 From hart at pglaf.org Mon Jan 10 09:47:47 2005 From: hart at pglaf.org (Michael Hart) Date: Mon Jan 10 09:47:49 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: <41E1D033.2060002@adelaide.edu.au> References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1D033.2060002@adelaide.edu.au> Message-ID: On Mon, 10 Jan 2005, Steve Thomas wrote: > Moore's Law actually has nothing to do with ebooks: "Moore's law is an > empirical observation stating, in effect, that at our rate of technological > development and advances in the semiconductor industry, the complexity of > integrated circuits doubles every 18 months." (From Wikipedia.) > > Michael Hart likes to compare PG growth with Moore's Law, but nobody should > place too much importance on this. The important thing is that PG continues > to grow, and that it continues to develop. When doing public relations, it is important to use references the general public is already familiar with. Moore's Law will be recognized by millions as the most popular growth curve and it is something Project Gutenberg has used all along. We also thus get continued recognition from those who know us. Whether Moore's Law is technically correct, etc., is not every person's cup of tea, it's a useful reference that people know. mh From hart at pglaf.org Mon Jan 10 10:11:28 2005 From: hart at pglaf.org (Michael Hart) Date: Mon Jan 10 10:11:29 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: <41E1A2AB.4070108@perathoner.de> References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1A2AB.4070108@perathoner.de> Message-ID: On Sun, 9 Jan 2005, Marcello Perathoner wrote: > Michael Hart wrote: > >> However, 1990 is the first year we actually made totally consistent >> additions every month to the collection, so it's the best start point. > > So you pick an arbitrary starting point to make the math come out right. 
>
> Why not choose 1971 as starting point and accept where the math gets you: we
> are *behind* Moore's Law.

Hardly arbitrary, even as you yourself quoted above. 1990 was the first year of monthly production, a regular Newsletter, and most of the other things associated with Project Gutenberg. Growth in the 1970's was pretty much on a once-a-year basis, as there were severe limitations on our space allocations. The 1980's were pretty much devoted to Shakespeare and The Bible. Thus 1990 represents the best place to begin.

If you think this is recent reasoning, I quote below from one of our old index files from the period:

***

The Bible and Shakespeare represented the entire effort for the 1980's, and the Bible alone is about 1,000 times larger than our first file, the U.S. Declaration of Independence, and so is the Complete Shakespeare. [That Shakespeare was never released due to changes in the copyright law]

Dec 1979 Abraham Lincoln's First Inaugural Address       [linc1xxx.xxx]  9
Dec 1978 Abraham Lincoln's Second Inaugural Address      [linc2xxx.xxx]  8
Dec 1977 The Mayflower Compact                           [mayflxxx.xxx]  7
Dec 1976 Give Me Liberty Or Give Me Death, Patrick Henry [liberxxx.xxx]  6
Dec 1975 The United States' Constitution                 [constxxx.xxx]  5
Nov 1973 Gettysburg Address, Abraham Lincoln             [gettyxxx.xxx]  4
Nov 1973 John F. Kennedy's Inaugural Address             [jfkxxxxx.xxx]  3
Dec 1972 The United States' Bill of Rights               [billxxxx.xxx]  2
Dec 1971 Declaration of Independence                     [whenxxxx.xxx]  1

From hart at pglaf.org Mon Jan 10 10:13:38 2005
From: hart at pglaf.org (Michael Hart)
Date: Mon Jan 10 10:13:40 2005
Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released
In-Reply-To: <41E19E0E.6000307@perathoner.de>
References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E19E0E.6000307@perathoner.de>
Message-ID:

As I said, you can pick the high or low points and make it appear as if the growth rate was either much greater or much less than the Moore's Law prediction.
However, I didn't believe anyone would be silly enough to DO it, and expect any credence. . . .

Michael

On Sun, 9 Jan 2005, Marcello Perathoner wrote:

> Michael Hart wrote:
>
>>> This far exceeds Moore's Law projections from 10 eBooks in 1990,
>>> which would predict 15,000 around August, 2006, and which every
>>> pundit has continually said was an impossible growth rate:
>>>
>>> Projected Growth Rate
>>>
>>> Total   Date       Doubled  Years
>>>
>>>    10   Dec, 1990    0       0
>>>    20   Jun, 1992    1       1.5
>>>    40   Dec, 1993    2       3
>>>    80   Jun, 1995    3       4.5
>>>   160   Dec, 1996    4       6
>>>   320   Jun, 1998    5       7.5
>>>   640   Dec, 1999    6       9
>>>  1280   Jun, 2001    7      10.5
>>>  2560   Dec, 2002    8      12
>>>  5120   Jun, 2004    9      13.5
>>> 10240   Dec, 2005   10      15
>>> 15000   Aug, 2006   10.5    15+   <<< Predicted Date for ~15,000
>>> 20480   Jun, 2007   11      16.5
>
>
> Bzzzzt, wrong. But thank you for playing!
>
>
> You tried to show that the number of books in the collection obeys Moore's
> Law. Moore's Law tries to fit the data to a 2 ^ t exponential curve with a
> doubling rate of 1.5 years.
>
> In that case we have: you started in 1971 and we have reached 10.000 books by
> the end of 2003. That's roughly 33 years for 10000 books.
>
> With 33 years and 10000 books we get:
>
>   x * 2 ^ (33 / 1.5) = 10000
>
> and we solve:
>
>   x = 0.002384
>
>
> A year later than book 10.000 we should have gotten to:
>
>   0.002384 * 2 ^ (34 / 1.5) = 15873
>
> which we have failed to do.
>
>
> We should get to #20.000 a year and a half after #10.000. That would be May
> 2005.
>
>
> So much for Moore's Law, which, by the way, doesn't work well in computer
> science either, but is for some strange reason one of the most-cited "Laws".
>
>
> I'll attach a plot of the function:
>
>   0.002384 * 2 ^ ((x - 1971) / 1.5)
>
> starting at x = 2000 and ending at x = 2008. That is, if the attachment
> comes thru. Otherwise use Gnuplot to plot it yourself.
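
For anyone who wants to check the arithmetic in the quoted message without Gnuplot, here is a minimal sketch in Python. It only re-derives the numbers given there; the anchoring choices (start in 1971, 10,000 books about 33 years later, a 1.5-year doubling period) are that message's assumptions, and the function names are made up for illustration.

```python
# Re-derivation of the exponential fit in the quoted message.
# Assumptions taken from that message, not established facts:
#   - the collection starts in 1971
#   - 10,000 books are reached about 33 years later (end of 2003)
#   - the curve doubles every 1.5 years

DOUBLING_YEARS = 1.5


def fit_scale(years_elapsed: float, total: float) -> float:
    """Solve x * 2 ** (years_elapsed / DOUBLING_YEARS) = total for x."""
    return total / 2 ** (years_elapsed / DOUBLING_YEARS)


def projected_total(scale: float, years_elapsed: float) -> float:
    """Evaluate scale * 2 ** (years_elapsed / DOUBLING_YEARS)."""
    return scale * 2 ** (years_elapsed / DOUBLING_YEARS)


scale = fit_scale(33, 10_000)
after_34 = projected_total(scale, 34)      # one year after book 10,000
after_34_5 = projected_total(scale, 34.5)  # a year and a half after

print(f"x = {scale:.6f}")
print(f"after 34 years:   {after_34:.0f}")
print(f"after 34.5 years: {after_34_5:.0f}")
```

Run as written, this should give x of about 0.002384, roughly 15,874 books at the 34-year mark (the 15873 in the message comes from plugging in the rounded x), and 20,000 at 34.5 years. Whether 1971 or 1990 is the right starting point is exactly what the thread is arguing about; the sketch takes no side on that.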
> > > > -- > Marcello Perathoner > webmaster@gutenberg.org > > From hart at pglaf.org Mon Jan 10 10:17:37 2005 From: hart at pglaf.org (Michael Hart) Date: Mon Jan 10 10:17:39 2005 Subject: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Re In-Reply-To: <41E190F8.55017770@ibiblio.org> References: <41E190F8.55017770@ibiblio.org> Message-ID: On Sun, 9 Jan 2005, Michael Dyck wrote: > Gardner Buchanan wrote: >> >> On 12:53:14 Miranda van de Heijning wrote: >>> >>> Other than that, congratulations on reaching 15,000! >>> What was posted as eBook 15000? >> >> As near as I can make out, we are not there yet. 14639 is the >> highest number I can see and it was posted just now: 9-jan-2005. > > I suspect Michael Hart has included the PG of Australia collection > (400 on Wednesday) in his grand total. We always have, with PGAU's permission. Hopefully we will soon be adding some from PGEU, PG Canada, etc. Michael From hart at pglaf.org Mon Jan 10 10:23:03 2005 From: hart at pglaf.org (Michael Hart) Date: Mon Jan 10 10:23:03 2005 Subject: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Re In-Reply-To: References: Message-ID: On Sun, 9 Jan 2005, Gardner Buchanan wrote: > Hi Miranda, > > On 12:53:14 Miranda van de Heijning wrote: >> >> >> Other than that, congratulations on reaching 15,000! What was posted as >> eBook 15000? >> > > As near as I can make out, we are not there yet. 14639 is the > highest number I can see and it was posted just now: 9-jan-2005. There were several candiates being discussed for the actual file to be labeled as /15000 a few months ago, but it's been pretty quiet about what has actually been chosen. . . . ;-) From shimmin at uiuc.edu Mon Jan 10 10:27:53 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Mon Jan 10 10:28:02 2005 Subject: PG editorial policy? 
[Re: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released] In-Reply-To: <20050110121211.GA25054@clipper.ens.fr> References: <027201c4f69a$5ac605a0$3201a8c0@pink> <20050110121211.GA25054@clipper.ens.fr> Message-ID: <41E2C929.1020904@uiuc.edu> Whether production is levelling off or not, I increasingly find one conclusion inexorable: A few thousand of us, in a few years, have produced 15,000 books. The public domain contains, say, 10 million volumes. To digitize it using our current methods in a reasonable amount of time will require a million-person volunteer force. I'm not sure how to recruit a million proofreaders, but if anyone has some good ideas for finding the next 10,000, we should listen. -- RS From marcello at perathoner.de Mon Jan 10 10:36:00 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 10 10:36:04 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released In-Reply-To: References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E19E0E.6000307@perathoner.de> Message-ID: <41E2CB10.3050609@perathoner.de> Michael Hart wrote: > As I said, you can pick the high or low points and make it appear > as if the growth rate was either much greater or much less than > the Moore's Law prediction. > > However, I didn't believe anyone would be silly enough to DO it, > and expect any credence. . . . YOU did it. -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Mon Jan 10 09:51:37 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 10 10:39:02 2005 Subject: [gutvol-d] Re: Problem in file retrieval In-Reply-To: <007501c4f723$35198f00$6501a8c0@novocon.net> References: <007501c4f723$35198f00$6501a8c0@novocon.net> Message-ID: <41E2C0A9.5030200@perathoner.de> David Widger wrote: > A new quirk in access to our PG files and not a happy one. 
A main advantage of > the new directory system is direct access to a file when its url is entered and > until this morning that has been the case. I now find that when a direct link is > entered one is _not_ taken directly to the file but rather to the PG catalog. I > can only imagine this an unintended side effect of some program change. > > For example: > > http://www.gutenberg.org/dirs/3/1/7/3176/3176-h/3176-h.htm > > has for the past year taken users directly to Twain's "Innocents Abroad", now it > takes them to the bibrec where they have to choose again from a long confusing > list of files. I cannot reproduce this. Clicking the url in the mail or copy and pasting the url into a browser window gives me the file and not the bibrec. This is related to a recent change in the site programming which I am testing. I redirect all deep links to files from external pages to the bibrec page. This has some advantages: - the link won't go dead when we REPost the file - the user gets a choice of formats - the user doesn't get an outdated edition - our books get a better google ranking - the user gets to see our site. On the downside, sometimes you have to click some more to get the file you want. How does this work? _If_ the browser provides a referrer _and_ the referrer is not from our site, the user gets redirected. When opening the file from a bookmark or entering the url in the location bar, the browser should send _no_ referrer, and the user will not be redirected. When clicking on a link on a page, the browser should send a referrer (can be turned off). The list of "good" referrers, which will never be redirected, contains www.gutenberg.org only, but can be expanded to contain any "Independent Gutenberg Search" site. How did you access the file ? I'd like to try this approach for a few weeks and see if I hit any "hard" problems with it. 
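The redirect rule described above can be sketched roughly as follows. This is not the site's actual code: the function name and the way the referrer is passed in are assumptions made for illustration, and only the single "good" host named in the message is listed.

```python
from urllib.parse import urlsplit

# Hosts whose links are never redirected; the message names only this one.
GOOD_REFERRER_HOSTS = {"www.gutenberg.org"}


def should_redirect_to_bibrec(referrer):
    """Return True when a deep link should be sent to the bibrec page.

    `referrer` is the value of the HTTP Referer header, or None/"" when the
    browser sent none (bookmark, typed URL, or referrer sending turned off).
    """
    if not referrer:
        return False  # no referrer: serve the file directly
    host = urlsplit(referrer).hostname
    # Redirect only when a referrer is present and it is not one of ours.
    return host not in GOOD_REFERRER_HOSTS
```

Under this sketch a request with no referrer, or one coming from www.gutenberg.org, gets the file; a request carrying any other referrer gets the bibrec page.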
-- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Mon Jan 10 10:07:12 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 10 10:39:09 2005 Subject: [gutvol-d] Fw: [gweekly] 15, 00th Project Gutenberg eBook Released In-Reply-To: <20050110101213.GB20581@clipper.ens.fr> References: <014501c4f64a$3127ef10$0302a8c0@PAULANDMIRANDA> <41E1A194.3A3DFEF9@ibiblio.org> <027201c4f69a$5ac605a0$3201a8c0@pink> <41E1C34F.4060600@verizon.net> <20050110101213.GB20581@clipper.ens.fr> Message-ID: <41E2C450.2010102@perathoner.de> Sebastien Blondeel wrote: > Suggestion For Improvements: Work on PG and PGDPs's Home Pages > -------------------------------------------------------------- > > ... The > ebooksgratuits.com webmaster has a very active group of people doing > e-books in Word (I'm working on a filter to help them transform that in > PG- acceptable formats, such as TXT or XHTML). He thinks a BIG reason > why PG and PGDPs are not successful is the fact that the websites are > not clear, not sexy, etc. gutenberg.org ranks 11,132th in the Alexa stats and ebooksgratuits.com ranks 690,225th. gutenberg.org reaches 79 out of a million web users, ebooksgratuits.com reaches 0.75 out of a million web users. See: http://www.alexa.com/data/details/traffic_details?&range=3m&size=large&y=t&url=gutenberg.org http://www.alexa.com/data/details/traffic_details?&range=3m&size=large&y=t&url=ebooksgratuits.com People should at least try to get the facts, before opening their BIG mouths. > You can have a look at his site or ask him for > details to know what he means. I think he'd better take a look at our site. 
-- Marcello Perathoner webmaster@gutenberg.org From hart at pglaf.org Mon Jan 10 10:39:18 2005 From: hart at pglaf.org (Michael Hart) Date: Mon Jan 10 10:39:20 2005 Subject: [gutvol-d] National Web library do-able, affordable, visionary In-Reply-To: <169faf16c41d.16c41d169faf@ncf.ca> References: <169faf16c41d.16c41d169faf@ncf.ca> Message-ID: Thanks. . .emails him at mgeist@uottawa.ca From ag737 at freenet.carleton.ca Mon Jan 10 12:04:42 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Mon Jan 10 12:04:50 2005 Subject: [gutvol-d] Re: National Web library do-able, affordable, visionary Message-ID: <174073171d6a.171d6a174073@ncf.ca> Also contact the Librarian and Archivist of Canada, Mr. Ian Wilson: Ian.Wilson@lac-bac.gc.ca From shalesller at writeme.com Mon Jan 10 12:43:16 2005 From: shalesller at writeme.com (D. Starner) Date: Mon Jan 10 12:43:25 2005 Subject: PG editorial policy? [Re: [gutvol-d] Fw: [gweekly] 15,000th Project Gutenberg eBook Released] Message-ID: <20050110204316.479B14BDAB@ws1-1.us4.outblaze.com> > You are touching here the problem of the (lack of) editorial policy of > PG / PGDP. Why is this a problem? And if you see it as a problem, why don't you fix it? Jon Ingram thought that PG was missing good, complete editions of Chaucer and Pope and Dryden and Wordsworth, but instead of trying to tell me what to scan, he started scanning complete editions of those authors. It's a much more productive solution. > All this makes for a not very coherent, consistent editorial policy. I > guess literature people can easily criticize the PG French catalog (some > very obscure books, and some blatant misses). It's called a library. I'm sitting in a library that reached two million volumes a few years ago, and they have some very obscure books, and at the same time has some blatant misses. (For example, they have only 10 volumes of Edgar Rice Burroughs, and half of those are in special collections.) 
> They're not hackers, they > don't have this culture of "let's get involved, roll up our sleeves and > change the world", but still they could be useful to PG. How? Like many volunteer groups, we already have many people who want to run things already. Like many successful volunteer groups, PG goes out of its way to give a lot of freedom to the people actually doing the work. If they're not willing to roll up their sleeves and do something, how can they be useful to PG? -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm From jtinsley at pobox.com Mon Jan 10 12:49:10 2005 From: jtinsley at pobox.com (Jim Tinsley) Date: Mon Jan 10 12:49:59 2005 Subject: [gutvol-d] Re: Problem in file retrieval Message-ID: <20050110204910.GB17222@panix.com> On Mon, 10 Jan 2005 18:51:37 +0100, Marcello Perathoner wrote: > >This is related to a recent change in the site programming which I am >testing. I redirect all deep links to files from external pages to the >bibrec page. I would never have noticed this, since I block referer information from my browser precisely to prevent sites doing this to me. However, despite having no personal problem with it, I suggest you reconsider. I think that we're coming from different angles at this question. What you regard as "deep links", I regard as "the texts" -- the things we're here to make and distribute. David, I suggest you change your pages to point to archive.org, or Sailor, or something. That will allow readers to access your books directly. I'm sure any sites, especially search sites, pointing to us whose webmasters notice this will do the same. 
jim From jeroen.mailinglist at bohol.ph Mon Jan 10 12:59:59 2005 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Mon Jan 10 12:58:59 2005 Subject: [gutvol-d] Re: Problem in file retrieval In-Reply-To: <41E2C0A9.5030200@perathoner.de> References: <007501c4f723$35198f00$6501a8c0@novocon.net> <41E2C0A9.5030200@perathoner.de> Message-ID: <41E2ECCF.7090608@bohol.ph> Marcello Perathoner wrote: > > When opening the file from a bookmark or entering the url in the > location bar, the browser should send _no_ referrer, and the user will > not be redirected. Maybe you could add known on-line mail readers and things with "webmail" or "neomail" in the referrer string to the exception list.... I sometimes read my email from a website. I use the same technique to stop hotlinking images of my website -- did you also achieve that effect? Jeroen From sharris at steveharris.net Mon Jan 10 13:03:53 2005 From: sharris at steveharris.net (steve harris) Date: Mon Jan 10 13:01:41 2005 Subject: PG editorial policy? [Re: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released] In-Reply-To: <20050110121211.GA25054@clipper.ens.fr> Message-ID: On Jan. 10, Sebastien Blondeel wrote: > On Sun, Jan 09, 2005 at 02:55:28PM -0800, steve harris wrote: > > While one important issue is the post-proofing bottleneck > in DP (which > > is being given attention), as important but more fundamental is > > whether > > Can you give details about this bottleneck issue? > There are about 2200 books in DP that have been proofed, but are in various stages of post-proofing processing. You can see the specifics at DP's Stats Central.
> > On the other hand, it would be difficult to set up an > official editorial > board: of course it should not be too bureaucratic and > complicated, of course it should not have a monopoly of the > books proposed to PGDP (PMs would still be free to kick in > books they just like, keeping in mind they will delay the > more "important" books. We work in limited resources, so we > should define priorities). > I don't support the need for an 'official editorial board', certainly not a group to exclude one work or another. At the same time, I think it would help if there was a group/process that gathered a list of works we would encourage people to work on. I did my own for the past two years at www.steveharris.net/PGList.htm . > But above all we are missing the competent people: I guess a > bunch of University professors specialized in pre-XXth > century literature, history, philosophy etc. would do, but > how many of those know PG? (If you don't like scholars > because they tend to be non pragmatic and argue about > pointless details, replace that with: essay writers, > journalists, whoever is important in the "culture" of the > language considered). > I think it would be a great set of projects if someone wanted to contact the MLA or American historical Association or other group and worked with them on generating a list of key works in each area. It's the sort of contact that could lead to greater uses of the PG collection, as well. More broadly, PG has focused on copyright-production-posting segments. A more robust view extending to both text collection and distribution/use of the materials would be a good way to be more effective in our core functions as well as extend the scope and usefulness of our product. I also think it would be useful if PG were to have enough management that such efforts could be endorsed and facilitated, not just left to people working on their own. 
To me, the open source coding groups, like the Apache Foundation or Mozilla, are useful non-coercive organizational models. Thx, smh > From miranda_vandeheijning at blueyonder.co.uk Mon Jan 10 13:19:30 2005 From: miranda_vandeheijning at blueyonder.co.uk (Miranda van de Heijning) Date: Mon Jan 10 13:19:48 2005 Subject: PG editorial policy? [Re: [gutvol-d] Fw: [gweekly] 15, 000thProject Gutenberg eBook Released] References: <20050110204316.479B14BDAB@ws1-1.us4.outblaze.com> Message-ID: <00ad01c4f75a$13dd8b20$0302a8c0@PAULANDMIRANDA> Sebastien Blondeel wrote: > All this makes for a not very coherent, consistent editorial policy. I > guess literature people can easily criticize the PG French catalog (some > very obscure books, and some blatant misses). There's a case to be made that those obscure books are exactly the ones that need scanning and archiving most badly. I'm sure cheap copies of classic works will always be relatively easily available, but who is going to reprint all those long-forgotten authors whose works are wasting away in attics and recycle shops? I'm all in favour of getting more obscure books into PG. To me PG is a museum as much as it is an archive. Miranda From shalesller at writeme.com Mon Jan 10 13:27:51 2005 From: shalesller at writeme.com (D. Starner) Date: Mon Jan 10 13:28:02 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology Message-ID: <20050110212752.45C984BE65@ws1-1.us4.outblaze.com> Instead of just general fiction, let me ramble on for a few minutes about a few niches I'm familiar with. A 28 year copyright term would be releasing the very first volumes in roleplaying games into the public domain. A 14 year copyright term might have some economic effect on the industry, but primarily on companies that bought the rights to reprint old material. (Looking at Dover, one could easily argue that they could stay in business.)
Pretty much everyone who wants the mainstream material that's 15 years old either has a copy or can find a used copy cheap. The non-mainstream material is, as a general rule, gone. Not because it's bad, but the market is dominated by a few licenses and a few more-or-less generic games. The games where you roleplay crawfish in a post-apocalyptic future weren't bad, they were just odd. They're part of the long tail, as mentioned in the Wired article someone pointed out, hurt further by the fact the whole industry is part of the long tail; big industry players consider 10,000 books to be a large print run. Twenty-eight years in the computer industry would hurt about 10 books. Andrew Tanenbaum and Donald Knuth would have to worry about people updating older editions and competing with them. Fortunately for them, the universities that employ them don't plan to stop paying them anytime soon. In exchange for slightly hurting those two dignitaries economically, we have access without question to a wealth of information about early computers and the early history of the industry, little of which is in print and equally little of which is available to the average programmer. But don't worry: all the precious books of BASIC programs for the Commodore 64 will still be under copyright for a long time to come. > Don't deprive millions of readers for the sake of a relatively > small number who want to read poor quality older books. The > really good stuff, the stuff people are actually interested in reading, > tends to stay in print. There are large volumes of niche material that is high quality and interesting, but is out of print because of the relatively small market. Even some highly lauded material, like the Lensman series, spent long periods of time out of print. Not everyone wants to read Stephen King or the other authors of the day, and not everything falls into the mass-market paperback category.
From shimmin at uiuc.edu Mon Jan 10 13:35:54 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Mon Jan 10 13:36:03 2005 Subject: [gutvol-d] Re: PG editorial policy? In-Reply-To: References: Message-ID: <41E2F53A.8080909@uiuc.edu> With regard to making a DP editorial board: The proofreaders are not a machine. They will not proof whatever is put in front of them with the same willingness or vigor. I attribute much of the increase in productivity at DP between summer 2003 and the present on the transition from a system that presented projects on a more-or-less first-in, first-out basis, to one that tried to ensure that at least some "easy" English material is present at all times. More recently, this system has been broadened so that (in English and French anyway, our two most popular languages), several genre-based queues attempt to ensure that at least one or two projects in that genre are available for proofing at any time. (This also provides an incentive for content providers to provide material that the proofers enjoy proofing more. Such material releases faster, and most human beings, on some level, are suckers for instant gratification.) If there existed a generally-agreed-upon canon of books that was in some sense more important to get into PG faster than other books, the best non-coercive way I can think of to encourage work on these books is to give them a queue of their own at DP. If proofers enjoyed working on them, it would become a fast queue, and this would encourage content providers to scan books off the list. However, I don't really see the idiosyncrasies of what is and is not in PG as a problem. Every library smaller than the major national libraries exhibits the idiosyncrasies of its acquisitions department.
Every book in PG has this much to recommend it: someone thought it worthwhile enough to go through the effort of putting it there. -- RS From jonhendry at mac.com Mon Jan 10 13:41:14 2005 From: jonhendry at mac.com (Jonathan Hendry) Date: Mon Jan 10 13:41:29 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology In-Reply-To: <20050110212752.45C984BE65@ws1-1.us4.outblaze.com> References: <20050110212752.45C984BE65@ws1-1.us4.outblaze.com> Message-ID: <59F7ED4E-6350-11D9-97DA-000A956D5546@mac.com> On Jan 10, 2005, at 4:27 PM, D. Starner wrote: > Twenty-eight years in the computer industry would hurt about 10 books. > Andrew > Tannenbaum and Donald Knuth would have to worry about people updating > older editions > and competing with them. I suspect Knuth's greater concern would be the loss of quality control, and the possibility of unauthorized, updated/modified editions having errors - possibly significant errors - while still being attributed to Knuth. (Especially since Knuth pays people for finding errors in his work.) From jeroen.mailinglist at bohol.ph Mon Jan 10 14:35:54 2005 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Mon Jan 10 14:34:31 2005 Subject: [gutvol-d] no flame. suggestion, comments, apology In-Reply-To: <59F7ED4E-6350-11D9-97DA-000A956D5546@mac.com> References: <20050110212752.45C984BE65@ws1-1.us4.outblaze.com> <59F7ED4E-6350-11D9-97DA-000A956D5546@mac.com> Message-ID: <41E3034A.6060800@bohol.ph> Jonathan Hendry wrote: > > I suspect Knuth's greater concern would be the loss of quality > control, and the possibility of unauthorized, updated/modified > editions having errors - possibly significant errors - while still > being attributed to Knuth. > > (Especially since Knuth pays people for finding errors in his work.) The quality control excuse I've heard once too often from people republishing public domain works and claiming fresh copyrights. 
This issue is simply solved by adding digital signatures on works, or in the physical book world, by using trademarks. Unauthorized is just another word for unlicensed, and yes, that is what we want: no need to ask permission. Jeroen. From herber at thing.com Mon Jan 10 12:17:24 2005 From: herber at thing.com (Steve Herber) Date: Mon Jan 10 14:47:31 2005 Subject: [gutvol-d] Top 1000 collection list and suggestion Message-ID: I found this list recently: TECH BIT: Top 1000 Most Widely Held Library Books OCLC Research, a division of the international Online Computer Library Center, has compiled a list of the top 1000 titles owned by its 50,000+ member libraries. View the top 10 (which includes the Bible, Mother Goose, and The Lord of the Rings) or all 1000 titles online. http://www.oclc.org/research/top1000/ http://www.oclc.org/research/top1000/complete.htm An annotated version of this list with two extra pieces of data would make an interesting Project Gutenberg web page. I would like to see a link to the Gutenberg edition and the date the item went into or will go into the public domain. I think some people will start to see the negative aspect of the long copyright times at the same time discovering how many documents are available from the Project. I am not currently on the mail list, so you may have already discussed this list. 
Cheers, -- Steve Herber herber@thing.com work: 206-221-7262 Security Engineer, UW Medicine, IT Services home: 425-454-2399 From marcello at perathoner.de Mon Jan 10 15:00:55 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 10 15:04:40 2005 Subject: [gutvol-d] Re: Problem in file retrieval In-Reply-To: <41E2ECCF.7090608@bohol.ph> References: <007501c4f723$35198f00$6501a8c0@novocon.net> <41E2C0A9.5030200@perathoner.de> <41E2ECCF.7090608@bohol.ph> Message-ID: <41E30927.9080409@perathoner.de> Jeroen Hellingman (Mailing List Account) wrote: > Maybe you could add known on-line mail readers and things with "webmail" > or "neomail" in the referrer string to the exception list.... I > sometimes read my email from a website. I use the same technique to stop > hotlinking images of my website -- did you also achieve that effect? I turned off image inlining a while ago. Nobody noticed so far. -- Marcello Perathoner webmaster@gutenberg.org From gbnewby at pglaf.org Mon Jan 10 16:52:15 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Jan 10 16:52:16 2005 Subject: PG editorial policy? [Re: [gutvol-d] Fw: [gweekly] 15, 000th Project Gutenberg eBook Released] In-Reply-To: <20050110204316.479B14BDAB@ws1-1.us4.outblaze.com> References: <20050110204316.479B14BDAB@ws1-1.us4.outblaze.com> Message-ID: <20050111005214.GA27797@pglaf.org> On Mon, Jan 10, 2005 at 12:43:16PM -0800, D. Starner wrote: > > You are touching here the problem of the (lack of) editorial policy of > > PG / PGDP. > > Why is this a problem? And if you see it as a problem, why don't you fix > it? Jon Ingram thought that PG was missing good, complete editions of > Chaucer and Pope and Dryden and Wordsworth, but instead of trying to > tell me what to scan, he started scanning complete editions of those > authors. It's a much more productive solution. > > > All this makes for a not very coherent, consistent editorial policy. 
I > > guess literature people can easily criticize the PG French catalog (some > > very obscure books, and some blatant misses). > ... Just a quick note: the question was not really about editorial policy, but collection development policy. We *do* have an editorial policy, which is spelled out in our FAQ & in DP's procedures (some of it is enforced, some is just guidance). As far as collection development (which is a course you can take in most Library Science degree programs, BTW): I once started to try to write our PG collection development policy. What I realized is that we really already have one, even though it's not spelled out explicitly as such. It's what people have said in this thread: those who do the work to create eBooks get to see those eBooks go into the collection. This de facto collection development policy permeates many of our documents, such as http://gutenberg.org/about Sometimes "the work" is more than just finding/scanning/OCRing/ proofing a particular book. For example, people who want to use particular languages, fonts etc. that don't work as well with OCR software & the existing software at DP or elsewhere might find they need to develop some additional infrastructure to get going. But for most books (I'd guess well over 90% of printed items in the public domain, worldwide, in any language), the work simply consists of doing the scanning, OCR & proofing. Working with DP might be a great fit, or you might prefer to do your own project solo. As always, there are larger issues (for example, how contemporary works, audio eBooks, video, etc. fit with our main focus on public domain from print). But on the whole, I think we have a very clear & unambiguous collection development policy. -- Greg From j.hagerson at comcast.net Mon Jan 10 17:02:11 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Mon Jan 10 17:02:26 2005 Subject: [gutvol-d] Re: PG editorial policy? 
In-Reply-To: <41E2F53A.8080909@uiuc.edu> Message-ID: <002101c4f779$31cfec30$6401a8c0@sarek> One of the content providers within Distributed Proofreaders has found an MLA (I think) list of the top ten books for each of the years just prior to 1923. He has taken it upon himself (he has access to the Library of Congress in DC) to scan and provide each of these books in fiction and non-fiction. This is one person seeing an opportunity and moving on it. No "editorial board" told him that this is what he was "supposed" to do. He just did it. And we all benefit. John Hagerson From phil at thalasson.com Mon Jan 10 17:24:24 2005 From: phil at thalasson.com (Philip Baker) Date: Mon Jan 10 17:25:55 2005 Subject: [gutvol-d] Re: Problem in file retrieval In-Reply-To: <41E2C0A9.5030200@perathoner.de> Message-ID: Marcello Perathoner wrote: >David Widger wrote: > >> A new quirk in access to our PG files and not a happy one. A main advantage >of >> the new directory system is direct access to a file when its url is entered >and >> until this morning that has been the case. I now find that when a direct >link >is >> entered one is _not_ taken directly to the file but rather to the PG catalog. >I >> can only imagine this an unintended side effect of some program change. >> >> For example: >> >> http://www.gutenberg.org/dirs/3/1/7/3176/3176-h/3176-h.htm >> >> has for the past year taken users directly to Twain's "Innocents Abroad", now >it >> takes them to the bibrec where they have to choose again from a long >confusing >> list of files. > >I cannot reproduce this. Clicking the url in the mail or copy and >pasting the url into a browser window gives me the file and not the bibrec. > > >This is related to a recent change in the site programming which I am >testing. I redirect all deep links to files from external pages to the >bibrec page. 
This has some advantages: > > - the link won't go dead when we REPost the file > - the user gets a choice of formats > - the user doesn't get an outdated edition > - our books get a better google ranking > - the user gets to see our site. > >On the downside, sometimes you have to click some more to get the file >you want. > >How does this work? _If_ the browser provides a referrer _and_ the >referrer is not from our site, the user gets redirected. > >When opening the file from a bookmark or entering the url in the >location bar, the browser should send _no_ referrer, and the user will >not be redirected. > >When clicking on a link on a page, the browser should send a referrer >(can be turned off). The list of "good" referrers, which will never be >redirected, contains www.gutenberg.org only, but can be expanded to >contain any "Independent Gutenberg Search" site. > > Search engine bots don't usually give a referrer and so they will not see any change. -- Philip Baker From marcello at perathoner.de Mon Jan 10 23:01:56 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 10 23:02:07 2005 Subject: [gutvol-d] Re: Problem in file retrieval In-Reply-To: References: Message-ID: <41E379E4.5090603@perathoner.de> Philip Baker wrote: > Search engine bots don't usually give a referrer and so they will not > see any change. That's exactly what I want. Otherwise they would index the bibrec page instead of the book. The user clicking on the search results will get to the bibrec page, where she can select a file format and compression. -- Marcello Perathoner webmaster@gutenberg.org From ag737 at freenet.carleton.ca Tue Jan 11 12:41:31 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Tue Jan 11 12:41:41 2005 Subject: [gutvol-d] Top 1000 collection list and suggestion Message-ID: <19955c196ddd.196ddd19955c@ncf.ca> I'm on it, but can only do it considering life+50 and life+70. I don't have the time to apply the messy US law.
I'll post up my results when they're ready. ----- Original Message ----- From Steve Herber Date Mon, 10 Jan 2005 12:17:24 -0800 (PST) To gutvol-d@lists.pglaf.org Subject [gutvol-d] Top 1000 collection list and suggestion An annotated version of this list with two extra pieces of data would make an interesting Project Gutenberg web page. I would like to see a link to the Gutenberg edition and the date the item went into or will go into the public domain. I think some people will start to see the negative aspect of the long copyright times at the same time discovering how many documents are available from the Project. From joshua at hutchinson.net Tue Jan 11 12:53:17 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Tue Jan 11 12:53:25 2005 Subject: [gutvol-d] Top 1000 collection list and suggestion Message-ID: <20050111205317.301B710999C@ws6-4.us4.outblaze.com> Are you being serious? The US law is anything but messy. Instead of trying to find out exactly when the author died (and that can be harder than it sounds for obscure authors) or even more fun, all the editors, illustrators and contributors to a composite work, find out when they ALL died, take the one that died last .... With the US law, you just take the printed publication date and add 95 years. It is a ridiculously long copyright term, true, but it is very easy to determine. Josh ----- Original Message ----- From: "Wallace J.McLean" > > I'm on it, but can only do it considering life+50 and life+70. I don't > have the time to apply the messy US law. I'll post up my results when > they're ready. > > > ----- Original Message ----- > > From Steve Herber > Date Mon, 10 Jan 2005 12:17:24 -0800 (PST) > To gutvol-d@lists.pglaf.org > Subject [gutvol-d] Top 1000 collection list and suggestion > > An annotated version of this list with two extra pieces of data would > make an interesting Project Gutenberg web page.
I would like to see a
> link to the Gutenberg edition and the date the item went into or will
> go into the public domain. I think some people will start to see the
> negative aspect of the long copyright times at the same time
> discovering how many documents are available from the Project.
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From shalesller at writeme.com Tue Jan 11 14:11:03 2005
From: shalesller at writeme.com (D. Starner)
Date: Tue Jan 11 14:11:13 2005
Subject: [gutvol-d] Top 1000 collection list and suggestion
Message-ID: <20050111221103.229AD4BDAB@ws1-1.us4.outblaze.com>

> The US law is anything but messy. [...]
> With the US law, you just take the printed publication date and
> add 95 years.

Everything before 1923 is in the public domain, which is a little
better.

The base of US law is simple: everything printed before 1923 or more
than 95 years old is in the public domain. But there are a lot of books
in the public domain due to quirks and various rules that are hard to
check. There are six or seven different rules you have to apply. For
instance, a book published outside the US that was not registered or
not renewed in the US, and that was out of copyright in its home nation
in 1998 or whenever that nation signed a copyright agreement (if
later), is in the public domain. In that sense, it's very messy.
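
The simple base rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this thread (printed before 1923, or more than 95 years old). The function name is hypothetical, and the sketch assumes that a 95-year term runs through the end of its calendar year. It deliberately ignores the registration, renewal, and foreign-publication rules that make the real determination messy.

```python
def us_base_rule_public_domain(publication_year: int, as_of_year: int) -> bool:
    """Return True if the simple base rule from the thread puts a work in
    the US public domain during `as_of_year`.

    Assumption (not from the thread): a 95-year term runs through the end
    of the 95th calendar year after publication, so the work becomes free
    on the following January 1.
    """
    if publication_year < 1923:
        # Flat cutoff cited in the thread.
        return True
    # Otherwise the 95-year term must have fully elapsed.
    return as_of_year > publication_year + 95


if __name__ == "__main__":
    for year in (1887, 1922, 1923, 1932):
        print(year, us_base_rule_public_domain(year, as_of_year=2005))
```

A result of False here only means the base rule does not clear the book; one of the harder-to-check rules might still apply.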
From phil at thalasson.com Tue Jan 11 15:26:47 2005
From: phil at thalasson.com (Philip Baker)
Date: Tue Jan 11 15:29:16 2005
Subject: [gutvol-d] Re: Problem in file retrieval
In-Reply-To: <41E379E4.5090603@perathoner.de>
Message-ID:

In article <41E379E4.5090603@perathoner.de>, Marcello Perathoner writes
>Philip Baker wrote:
>
>> Search engine bots don't usually give a referrer and so they will not
>> see any change.
>
>That's exactly what I want. Otherwise they would index the bibrec page
>instead of the book.
>
>The user clicking on the search results will get to the bibrec page,
>where she can select a file format and compression.
>

You listed getting a better Google ranking as one of the aims of the
change.

--
Philip Baker

From sly at victoria.tc.ca Wed Jan 12 00:46:29 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Wed Jan 12 00:46:51 2005
Subject: [gutvol-d] Canadian and American copyrights
In-Reply-To: <20050111221103.229AD4BDAB@ws1-1.us4.outblaze.com>
References: <20050111221103.229AD4BDAB@ws1-1.us4.outblaze.com>
Message-ID:

Re the recent discussion of various copyright regimes...

Yes, the U.S. was a hold-out through most of the 20th
century, avoiding the major convention (Berne) that
most other Western nations were party to.

But I would argue that if the U.S. did not have
its unique copyright history, then Project Gutenberg
would be rather a different thing today.

Neither American nor Canadian copyright laws are
"better", they are just different. Each provides
its own problems for those wanting to utilize the
public domain.

In a life+N regime, it's a little odd if you
find yourself wishing someone had died earlier.

(Just today, I looked up the dates for a book I was
hoping would be public domain, only to find--the
author died in 1968. Darn.)

And you also have the uncertain situations.
Here is a description from the LoC of a title that I have
available, if I want it:

LC Control Number: 32012123
Type of Material: Text (Book, Microform, Electronic, etc.)
Personal Name: Powell, Van. [from old catalog]
Main Title: The mystery crash,
Published/Created: New York, Chicago, A. L. Burt company [c1932]
Description: 285 p. front. 19 cm.
LC Classification: PZ7.P88 Sk no. 1

Is it PD in Canada under life+50? Who knows? I've done some
searching and cannot find any dates for the author.

Andrew

From shimmin at uiuc.edu Wed Jan 12 03:11:30 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Wed Jan 12 03:11:55 2005
Subject: [gutvol-d] Canadian and American copyrights
In-Reply-To:
References: <20050111221103.229AD4BDAB@ws1-1.us4.outblaze.com>
Message-ID: <41E505E2.9060607@uiuc.edu>

Many L+?? regimes have provisions for a fixed term when the author's
death date cannot readily be determined. I couldn't tell you if Canada
is one of those jurisdictions.

-- RS

From ag737 at freenet.carleton.ca Wed Jan 12 12:21:35 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Wed Jan 12 12:21:45 2005
Subject: [gutvol-d] Re: gutvol-d Digest, Vol 6, Issue 27
Message-ID: <1bc35e1bbc8a.1bbc8a1bc35e@ncf.ca>

I'm being dead serious.

What would you rather do:

look up the publication dates for 1001 books and the death dates of the
author(s) for a large subset thereof?

look up just the death dates for < 1001 authors?

There are more books than authors. Just confirming the death dates
for CS Lewis, Charles Dickens, Hans Christian Andersen, Verdi, etc.,
gives me the public domain date for multiple books at a go.

The US law is messy. The fact that it's nice and clean for pre-1923
imprints doesn't unmessy it. It's a goddamn mess.

----- Original Message -----
>From "Joshua Hutchinson"
Date Tue, 11 Jan 2005 15:53:17 -0500
To "Project Gutenberg Volunteer Discussion"
Subject re: [gutvol-d] Top 1000 collection list and suggestion

Are you being serious? The US law is anything but messy.
Instead of trying to find out exactly when the author died (and that
can be harder than it sounds for obscure authors) or even more fun, all
the editors, illustrators and contributors to a composite work, find
out when they ALL died, take the one that died last .... With the US
law, you just take the printed publication date and add 95 years. It is
a ridiculously long copyright term, true, but it is very easy to
determine.

Josh

From ag737 at freenet.carleton.ca Wed Jan 12 12:30:59 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Wed Jan 12 12:31:08 2005
Subject: [gutvol-d] Canadian and American copyrights
Message-ID: <1bb5711b5cc9.1b5cc91bb571@ncf.ca>

>From Andrew Sly

> But I would argue that if the U.S. did not have
> its unique copyright history, then Project Gutenberg
> would be rather a different thing today.

Absolutely. PG holds a great number of books which are PD under the
general rule, but not PD in life+50 or life+70 countries.

> Neither American nor Canadian copyright laws are
> "better", they are just different. Each provides
> its own problems for those wanting to utilize the
> public domain.
Absolutely, but, assuming no further changes to either US or Canadian law, Canada already has a larger public domain, and will continue to have a larger public domain, and one which grows every January 1st. The US public domain is smaller and frozen for more than another decade. > And you also have the uncertain situations. Here is > a description from the LoC of a title that have > availible, if I want it: This is a major lacuna in the Canadian law; some other countries with life+ regimes have provisions which allow you to assume, in the absence of death-date information, that the work is PD after a certain efflux of time. Probably not long enough to take care of a 1930s-era imprint, though. On the other hand, I've currently got a book in copy.pglaf.org limbo because it doesn't contain a publication or copyright date, and the library information from various libraries give contradictory multiple US and British publication dates ranging from 1919 to 1926. Under Canadian law, I know for a fact that this book is unequivocally in the public domain, with an author who died in 1939. I'm trying to prove the publication date -- which I know to be 1920 from extrinsic sources -- but it's a tough slog to clear. > Personal Name: Powell, Van. [from old catalog] > Is it PD in Canada under life+50? Who knows? I've done some > searching and cannot find any dates for the author. If there's any chance this is a pseudonym, then it's public-domain in Canada (publication+50 rule). From joshua at hutchinson.net Wed Jan 12 12:47:28 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Wed Jan 12 12:47:36 2005 Subject: [gutvol-d] Re: gutvol-d Digest, Vol 6, Issue 27 Message-ID: <20050112204728.9DEF04F4ED@ws6-5.us4.outblaze.com> Ok, and after you move on to Joe Jim Billy Bob, the obscure hill-billy author? 
While you're busy trying to track down when he died, I simply flipped
the book open to page two, looked at the nice number printed there, and
checked off "Yep, out of copyright" and moved on to the next book.

As David pointed out, there are some circumstances where it can get
ugly, but those are fairly rare. The majority of US copyright issues
are easier to handle than the majority of Life+XX copyright issues.

Josh

----- Original Message -----
From: "Wallace J.McLean"
To: gutvol-d@lists.pglaf.org
Subject: [gutvol-d] Re: gutvol-d Digest, Vol 6, Issue 27
Date: Wed, 12 Jan 2005 15:21:35 -0500

> I'm being dead serious.
>
> What would you rather do:
>
> look up the publication dates for 1001 books and the death dates of the
> author(s) for a large subset thereof?
>
> look up just the death dates for < 1001 authors?
>
> There are more books than authors. Just confirming the death dates
> for CS Lewis, Charles Dickens, Hans Christian Andersen, Verdi, etc.,
> gives me the public domain date for multiple books at a go.
>
> The US law is messy. The fact that it's nice and clean for pre-1923
> imprints doesn't unmessy it. It's a goddamn mess.

From sly at victoria.tc.ca Wed Jan 12 12:52:58 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Wed Jan 12 12:53:08 2005
Subject: [gutvol-d] Latin quote in Sherlock Holmes
Message-ID:

For general interest, here's a posting from alt.language.latin
which mentioned Project Gutenberg. It's not very complimentary,
but it does say that we got the passage in question correct.

Andrew

Newsgroups: alt.language.latin
Date: 2005-01-12 04:06:38 PST

James A. Temple wrote:
> "August de Man" wrote in message
> news:41e4e2b0$0$47653$cd19a363@news.wanadoo.nl...
>
>> I wondered about "contemplar", and rightly so, because Dr. Watson
>> doesn't cite Horace quite correctly. It should have been:
>> sic solitus: 'populus me sibilat, at mihi plaudo
>> ipse domi, simul ac nummos contemplor in arca.'
>> Search for "nummos contemplar" and it's all about Sherlock Holmes;
>> search for "nummos contemplor", and you find the Latin texts.
>
> Fascinating!
The text from which I copied the line attributed to Dr.
> Watson was from the Easton Press publication of "The Adventures of
> Sherlock Holmes". An excerpt from the title page reads, "A
> definitive text, corrected and edited by Edgar W. Smith, ...". I
> suppose we shall never know whether Sir Arthur Conan Doyle was
> responsible for the errant "letter", followed by an oversight of
> Edgar W. Smith or whether Smith made the change himself. The plot
> thickens.

As a general rule (and you should have learned this in the course of a
long life), it is unwise to place too much trust in what you might read
on title pages. Simple errors such as this are easily introduced by
compositors when reprinting, for example for American editions (which
were often pirated).

For what it is worth, the Project Gutenberg eText version (and Project
Gutenberg are a byword for inaccuracy, mainly due to their choice of
editions) has the quotation correct. They claim to be reproducing the
1887 text, if you can believe that :-)

From nwolcott at dsdial.net Wed Jan 12 13:33:52 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Wed Jan 12 13:35:41 2005
Subject: [gutvol-d] Andale Mono font
Message-ID: <001001c4f8ee$a29b2d00$d49495ce@gw98>

I used to have the Andale Mono ttf font, but since I reinstalled Windows
and Office it has disappeared. Where did it come from? The font has all
characters distinguishable.

N Wolcott nwolcott2@post.harvard.edu

From shalesller at writeme.com Wed Jan 12 13:54:15 2005
From: shalesller at writeme.com (D.
Starner)
Date: Wed Jan 12 13:54:27 2005
Subject: [gutvol-d] Re: gutvol-d Digest, Vol 6, Issue 27
Message-ID: <20050112215415.BB6154BDAB@ws1-1.us4.outblaze.com>

> What would you rather do:
>
> look up the publication dates for 1001 books and the death dates of the
> author(s) for a large subset thereof?

I don't have to look up death dates. This is as much a complaint that
it's different as that it's messier.

> look up just the death dates for < 1001 authors?

I found "Oklahoma, and other poems" in my library. It took 30 seconds to
verify that it was in the public domain in the US. It took 20 minutes of
searching by library staff to find out that the author died in 1951.
Mind you, any other library in the world would have taken longer or
failed to come up with anything, as he used to live in this area and his
death date was found in a book of local gravestones. Finding the
publication date was worlds easier.

You complain that a book you have doesn't have a publication date. In my
experience, that's a rarity, and I still fail to understand why
European publishers do that. If you don't have a publication date, how
do you know that it's not a modern edition, and hence has typographical
copyright in the EU, or even new editorial content? How do you prove that
to the satisfaction of PG, who might have to prove it in court someday?

> There are more books than authors. Just confirming the death dates
> for CS Lewis, Charles Dickens, Hans Christian Andersen, Verdi, etc.,
> gives me the public domain date for multiple books at a go.

It's not that simple. Just because Baudelaire died before 1935 doesn't
mean the translator for Les Fleurs du Mal died before 1935, nor does it
tell you who the translators were. You have to look up the translations
of Les Fleurs du Mal, which will usually give you the publication dates,
and then look up who the translators were and when they died, and they
are much less likely to be well documented.
If you go through that list and mark The Flowers of Evil as clearable
because of when Baudelaire died, it's wrong; you've got to find out when
it was translated into English and then when the translator died (much
harder than the first).

Honestly, just confirming the death dates for most authors tells you that
all of their works are in the public domain in the US. The exceptions,
works published after their death or still unpublished, are often still
under copyright in life+x places, too.

From shalesller at writeme.com Wed Jan 12 14:02:06 2005
From: shalesller at writeme.com (D. Starner)
Date: Wed Jan 12 14:02:17 2005
Subject: [gutvol-d] Canadian and American copyrights
Message-ID: <20050112220206.825804BDAB@ws1-1.us4.outblaze.com>

> > Personal Name: Powell, Van. [from old catalog]
> > Is it PD in Canada under life+50? Who knows? I've done some
> > searching and cannot find any dates for the author.
>
> If there's any chance this is a pseudonym, then it's public-domain in
> Canada (publication+50 rule).

So copyright law in Canada is designed to screw the little guys? It's
highly unfair to give one author life+50 because she's well known and
give another publication+50 because he's not and "there's [some] chance
[his name] is a pseudonym". Anyway, I have a hard time believing that
the Paul French novels are going to be in the public domain in Canada in
a decade because Isaac Asimov used a pseudonym when writing them.
From shimmin at uiuc.edu Wed Jan 12 14:11:45 2005
From: shimmin at uiuc.edu (Robert Shimmin)
Date: Wed Jan 12 14:11:56 2005
Subject: [gutvol-d] Canadian and American copyrights
In-Reply-To: <20050112220206.825804BDAB@ws1-1.us4.outblaze.com>
References: <20050112220206.825804BDAB@ws1-1.us4.outblaze.com>
Message-ID: <41E5A0A1.3030301@uiuc.edu>

> So copyright law in Canada is designed to screw the little guys? It's
> highly unfair to give one author life+50 because she's well known and
> give another publication+50 because he's not and "there's [some] chance
> [his name] is a pseudonym". Anyway, I have a hard time believing that
> the Paul French novels are going to be in the public-domain in Canada in
> a decade because Isaac Asimov used a pseudonym when writing them.

Exactly. A pseudonymous work is only pub+50 while the author's actual
identity is not publicly known:

6.1 Except as provided in section 6.2, where the identity of the author
of a work is unknown, copyright in the work shall subsist for whichever
of the following terms ends earlier:

(a) a term consisting of the remainder of the calendar year of the
first publication of the work and a period of fifty years following the
end of that calendar year, and

(b) a term consisting of the remainder of the calendar year of the
making of the work and a period of seventy-five years following the end
of that calendar year,

but where, during that term, the author's identity becomes commonly
known, the term provided in section 6 applies.
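
The "whichever ends earlier" calculation in the quoted section 6.1 can be sketched as follows. This is a minimal illustration, not a clearance tool. The function names are hypothetical, and the sketch covers only the case where the author's identity stays unknown for the whole term. The making year of a book is usually not printed in it, so the example simply assumes it equals the publication year.

```python
def unknown_author_term_end(publication_year: int, making_year: int) -> int:
    """Last calendar year in which copyright subsists under the quoted
    section 6.1, assuming the author's identity never becomes commonly known.
    """
    term_a = publication_year + 50  # (a) rest of publication year + 50 years
    term_b = making_year + 75       # (b) rest of making year + 75 years
    return min(term_a, term_b)      # whichever term ends earlier


def public_domain_from(publication_year: int, making_year: int) -> int:
    """First calendar year in which the work would be in the public domain."""
    return unknown_author_term_end(publication_year, making_year) + 1


if __name__ == "__main__":
    # A [c1932] imprint, with the making year assumed (hypothetically) to be 1932.
    print(unknown_author_term_end(1932, 1932))
    print(public_domain_from(1932, 1932))
```

If the author's identity does become commonly known during that term, the calculation no longer applies and the ordinary term in section 6 governs instead.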
-- RS

From hyphen at hyphenologist.co.uk Thu Jan 13 14:13:11 2005
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Wed Jan 12 14:13:42 2005
Subject: [gutvol-d] Andale Mono font
In-Reply-To: <001001c4f8ee$a29b2d00$d49495ce@gw98>
References: <001001c4f8ee$a29b2d00$d49495ce@gw98>
Message-ID:

On Wed, 12 Jan 2005 16:33:52 -0500, "N Wolcott" wrote:

| I used to have the Andale Mono ttf font, but since I reinstalled windows and Office it has disappeared. Where did it come from ? Font has all distinguishable characters.
| N Wolcott nwolcott2@post.harvard.edu

I love Andale Mono, which came from M$.
I still have a copy and have emailed it to you.

--
Dave F

From shalesller at writeme.com Wed Jan 12 14:21:59 2005
From: shalesller at writeme.com (D. Starner)
Date: Wed Jan 12 14:22:09 2005
Subject: [gutvol-d] Canadian and American copyrights
Message-ID: <20050112222200.0663F4BDAB@ws1-1.us4.outblaze.com>

Robert Shimmin writes:

> Exactly. A pseudonymous work is only pub+50 while the author's actual identity is not publicly
> known:
>
> 6.1 Except as provided in section 6.2, where the identity of the author of a work is unknown,
> copyright in the work shall subsist for whichever of the following terms ends earlier:
>
> (a) a term consisting of the remainder of the calendar year of the first publication of the work
> and a period of fifty years following the end of that calendar year, and
>
> (b) a term consisting of the remainder of the calendar year of the making of the work and a period
> of seventy-five years following the end of that calendar year,
>
> but where, during that term, the author's identity becomes commonly known, the term provided in
> section 6 applies.

What does this mean for Project Gutenberg Canada? How much research is needed before they could
conclude that the identity of an author is not commonly known?
From nwolcott at dsdial.net Wed Jan 12 14:49:19 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Wed Jan 12 14:49:45 2005
Subject: [gutvol-d] Andale Mono font
References: <001001c4f8ee$a29b2d00$d49495ce@gw98>
Message-ID: <002601c4f8f8$fa8032e0$d49495ce@gw98>

As an answer to my own question, and also thanks to Dave Fawthrop:
Google informed me (after I couldn't find it on the MS site) that Andale
and Georgia are no longer supplied by MS. I now keep a copy of all my
fonts somewhere other than Windows so I can retrieve them later. It
turns out that Andale (formerly Monotype mono in W3.1) was included in
IE4 and IE5 but discontinued with IE6. I still have the old subscription
freebies from AOL etc. that had IE5 and IE4 on them. They have the fonts
in CAB files. I managed to extract the "supplementary Fonts" from the
IE5 setup disk using WINRAR, a shareware program which busts out cab
files. Georgia was also there.

Apparently around 2000 MS also offered unicode versions of the core
fonts, called "Fonts 2000"; I don't know if they are still available.
The DP font is a little more distinctive and ugly for proofing purposes,
but Andale does the trick too and looks better.

Moral--never throw anything away!

----- Original Message -----
From: N Wolcott
To: Project Gutenberg Volunteer Discussion
Sent: Wednesday, January 12, 2005 4:33 PM
Subject: [gutvol-d] Andale Mono font

I used to have the Andale Mono ttf font, but since I reinstalled windows
and Office it has disappeared. Where did it come from ? Font has all
distinguishable characters.

N Wolcott nwolcott2@post.harvard.edu

From hacker at gnu-designs.com Wed Jan 12 15:06:15 2005
From: hacker at gnu-designs.com (David A. Desrosiers)
Date: Wed Jan 12 15:07:28 2005
Subject: [gutvol-d] Andale Mono font
In-Reply-To: <002601c4f8f8$fa8032e0$d49495ce@gw98>
References: <001001c4f8ee$a29b2d00$d49495ce@gw98> <002601c4f8f8$fa8032e0$d49495ce@gw98>
Message-ID:

> I don't know if they are still available, called "Fonts 2000". The
> DP font is a little more distinctive and ugly for proofing purposes,
> but Andale does the trick too and looks better.

Why not use one that is freely available, and patent-free? There are
several great mono fonts available in the Freefont and Bitstream Vera
collections.

In fact, I used those exact two collections to create my "Plucker
Anti-aliased fonts" package, which you can see in full splendor here:

http://code.plkr.org/aa/

David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com

From sly at victoria.tc.ca Wed Jan 12 15:56:15 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Wed Jan 12 15:56:27 2005
Subject: [gutvol-d] Canadian and American copyrights
In-Reply-To: <20050112222200.0663F4BDAB@ws1-1.us4.outblaze.com>
References: <20050112222200.0663F4BDAB@ws1-1.us4.outblaze.com>
Message-ID:

On Wed, 12 Jan 2005, D. Starner wrote:

> What does this mean for Project Gutenberg Canada? How much research is needed before they could
> conclude that the identity of an author is not commonly known?
>

That's a good question.
It brings up the issue of what kind of copyright
clearance is appropriate or necessary for texts heading
into a PG of Canada collection.

Andrew

From j.hagerson at comcast.net Wed Jan 12 16:43:09 2005
From: j.hagerson at comcast.net (John Hagerson)
Date: Wed Jan 12 16:43:26 2005
Subject: [gutvol-d] Perl help needed
Message-ID: <006501c4f908$de215ec0$6401a8c0@sarek>

I have no experience at all with perl and I need to parse some
information from the PG catalog.rdf file. If someone is willing to
assist me, I would appreciate it.

Thank you.

John Hagerson

From ag737 at freenet.carleton.ca Wed Jan 12 18:43:55 2005
From: ag737 at freenet.carleton.ca (Wallace J.McLean)
Date: Wed Jan 12 18:44:09 2005
Subject: [gutvol-d] Bibliographic data
Message-ID: <1c29ac1c0745.1c07451c29ac@ncf.ca>

----- Original Message -----
>From "D. Starner"
Date Wed, 12 Jan 2005 13:54:15 -0800
To gutvol-d@lists.pglaf.org
Subject [gutvol-d] Re: gutvol-d Digest, Vol 6, Issue 27

>> What would you rather do:
>>
>> look up the publication dates for 1001 books and the death dates of the
>> author(s) for a large subset thereof?

>I don't have to look up death dates. This is as much a complaint that
>it's different rather than it's messier.

The US law *IS* messier. I have to work in both. I'll take life+ any
day.

>> look up just the death dates for < 1001 authors?

>I found "Oklahoma, and other poems" in my library. It took 30 seconds to
>verify that it was in the public domain in the US. It took 20 minutes of
>searching by library staff to find out that he died in 1951.

Either way, it's easier to look up <1001 -- probably around 750 --
authors' death dates (and not get some) than it is to look up 1001 -- no
less -- publication dates (and not get some of them, either).

>You complain that a book you have doesn't have a publication date.
In my >experience, that's a rarity Not in mine, unfortunately. It's very common with 19th-century English and American imprints, and extremely common with continental ones before WWII. >, and I still fail to find to understand why >European publishers do that. If you don't have a publication date, how >do you know that it's not a modern edition, and hence has typographical >copyright in the EU, or even new editorial content? I don't. And I don't care. I'm just going by the title of the original and considering the copyright in the original for the purposes of this experiment. I barely have time to do this experiment with the variables I've been given and the variables I've set for myself; buggered if I'll add more to the mix. >How do you prove that >to the satisfaction of PG, who might have to prove it in court someday? What does this have to do with PG proving anything in court? This is about the narrow question of considering the impact of life+50 vs. life+70 term on the list as it has been provided to us by a third party. >> There are more books than authors. Just by confirming the death dates >> for CS Lewis, Charles Dickens, Hans Christian Anderson, Verdi, etc., >> gives me the public domain date for multiple books at a go. >It's not that simple. I'm assuming, for the sake of argument, that it is. Translators? New editions? Typographical arrangements? I'm blind to them for the purpose of this analysis. Do they have an impact on what PG or anyone else can do with those works? Yes. Do I care? Not for the purpose of this little thought-experiment, no. >Just because Baudelaire died before 1935, doesn't >mean the translator for Les Fleurs du Mal died before 1935, I am only considering copyright in the original, pre-humous, untranslated edition of any of the works on the list. >Honestly, just confirming the death dates for most authors tells you that >all of thier works are in the public domain in the US. 
The exception, >works published after their death or still unpublished Or works published after 1922. And I don't care to look up TWO data points on many books; I'd rather just look up one data point, and recycle as many of those lookups as I can. You are welcome to do the same analysis of that list of 1001 under US law. But I'm not going to. From ag737 at freenet.carleton.ca Wed Jan 12 18:50:10 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Wed Jan 12 18:50:24 2005 Subject: [gutvol-d] Anonymous/pseudonymous term Message-ID: <1c0e111c0697.1c06971c0e11@ncf.ca> ----- Original Message ----- >From "D. Starner" Date Wed, 12 Jan 2005 14:02:06 -0800 To "Project Gutenberg Volunteer Discussion" Subject Re: [gutvol-d] Canadian and American copyrights >So copyright law in Canada is designed to screw the little guys? Huh? >It's highly unfair to give one author life+50 because she's well known and >give another publication+50 because he's not and "there's [some] chance >[his name] is a pseudonym". If the name is DEFINITELY a pseudonym (or the work is anonymous), and the true identity of the author does not, in the meantime, become known, then publication+50 is the rule. Publication+[regular term] is a common rule, around the world, for anonymous or pseudonymous works. Since the regular term of copyright is keyed to the life of the author in Berne countries, if the author wants to give his/her heirs the benefit of the posthumous term of copyright, then they can do so by taking credit for their work. However, since third parties must also rely on the term of copyright rules to determine what they, as third parties, are entitled to do with a work, it would be unfair to the public if they (we) were estopped from using an ancient work simply because no one knew who the author was, to calculate the term, to decide whether the work is public domain or not. 
>Anyway, I have a hard time believing that >the Paul French novels are going to be in the public-domain in Canada in >a decade because Isaac Asimov used a pseudonym when writing them. If Isaac Asimov is known to be the true author of works under the pseudonym "Paul French", then the regular term applies: the work is no longer pseudonymous. What's really unfair is the idiotic 95-year term under the US code for anonymous or pseudonymous works. From ag737 at freenet.carleton.ca Wed Jan 12 18:55:32 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Wed Jan 12 18:55:46 2005 Subject: [gutvol-d] Canadian and American copyrights Message-ID: <1c47d61c1b72.1c1b721c47d6@ncf.ca> >What does this mean for Project Gutenberg Canada? How much research is needed before >they could conclude that the identity of an author is not commonly known? Most "known" pseudonyms are either catalogued in the national union catalogue, or available through dead-tree reference sources. If searches of those two sources fail, then a letter of inquiry to one of the literary copyright collectives would probably suffice to act as a cover-your-ass. If these three lines of attack did not uncover the true identity of the author of an anonymous/pseudonymous work, then a court would probably conclude that the anon/psued term, not the life+ term, would apply, or was reasonably relied on by a third party re-using that work. This is a section, along with its predecessors, I have flagged to see if there's been any judicial comment on since the 1924 act was passed. From ag737 at freenet.carleton.ca Wed Jan 12 18:57:31 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Wed Jan 12 18:57:45 2005 Subject: [gutvol-d] Canadian and American copyrights Message-ID: <1bf7101c4825.1c48251bf710@ncf.ca> > That's a good question. It brings up the issue of what kind of > copyright clearance is appropriate or neccessary for texts heading > into a PG of Canada collection. 
Or any PG of [COUNTRY WITH LIFE+} collection, for that matter. The benefit, of course, is that once the death date of an author is conclusively determined, then, barring wrinkles like collaborators, translators, co-authors, illustrators, and posthumous publication, all of that author's works are, by definition, cleared. From brad at chenla.org Wed Jan 12 20:43:27 2005 From: brad at chenla.org (Brad Collins) Date: Wed Jan 12 20:46:21 2005 Subject: [gutvol-d] Canadian and American copyrights In-Reply-To: (Andrew Sly's message of "Wed, 12 Jan 2005 00:46:29 -0800 (PST)") References: <20050111221103.229AD4BDAB@ws1-1.us4.outblaze.com> Message-ID: Andrew Sly writes: > And you also have the uncertain situations. Here is > a description from the LoC of a title that have > availible, if I want it: > > LC Control Number: 32012123 > Type of Material: Text (Book, Microform, Electronic, etc.) > Personal Name: Powell, Van. [from old catalog] > Main Title: The mystery crash, > Published/Created: New York, Chicago, A. L. Burt company [c1932] > Description: 285 p. front. 19 cm. > LC Classification: PZ7.P88 Sk no. 1 > Interesting. LC Control Number: no2003090172 HEADING: Powell, Van 000 00427cz 2200145n 450 001 6046275 005 20030910113802.0 008 030909n| acannaabn |n aaa c 010 __ |a no2003090172 035 __ |a (OCoLC)oca06147918 040 __ |a LU |b eng |c LU |d DLC 100 1_ |a Powell, Van 670 __ |a His The mystery crash, 1932: |b t.p. (Van Powell) 670 __ |a LC Online Catalog, Sept. 8, 2003 |b (hdg.: Powell, Van; usage: Van Powell) 952 __ |a yz00 According to http://www.geocities.com/jjnevins/pulpss.html Sky Scout. The Sky Scout was created by A. Van Buren Powell and appeared in the four-book "Sky Scout" series, which began in 1932 with The Mystery Crash. The Sky Scout was an air detective. This is interesting because, according to LOC Authority "A. Van Buren Powell" is different from "Van Powell". The LOC Authority Heading Sez: LC Control Number: no 98132725 HEADING: Powell, A. 
Van Buren (Ardon Van Buren), b. 1886 000 00583nz 2200181n 450 001 916286 005 19981223051753.4 008 981222n| acannaab |a aaa c 010 __ |a no 98132725 035 __ |a (OCoLC)oca04886541 035 __ |a (DLC)no 98132725 040 __ |a MdU |c MdU 100 10 |a Powell, A. Van Buren |q (Ardon Van Buren), |d b. 1886 400 10 |a Powell, Ardon Van Buren, |d b. 1886 670 __ |a Call of the clouds, c1940: |b t.p. (A. Van Buren Powell) 670 __ |a LC PREM file |b (hdg.: Powell, Ardon Van Buren, 1886- ; usage: A. Van Buren Powell) 953 __ |a xx00 985 __ |c OCLC |e LSPC Kingkong (http://www.kingkong.demon.co.uk/ngcoba/po.htm) turns up: (Ardon) Van (Buren) POWELL {US} (M: 1886 - ?) (&ps: David O'HARA) The Mystery Boys And The Inca Gold [f|1931] The Mystery Boys And Captain Kidd's Message [f|1931] The Mystery Boys And The Secret Of The Golden Sun [f|1931] The Mystery Boys And The Chinese Jewels [f|1931] The Mystery Boys And The Hindu Treasure [f|1931] The Mystery Crash [f|1932] The Haunted Hangar [f|1932] The Vanishing Air Liner [f|1932] The Ghost Of Mystery Airport [f|1932] Jimmie Drury, Candid Camera Detective (ps: David O'HARA) [f|1938] What The Dark Room Revealed (ps: David O'HARA) [f|1939] Caught By The Camera (ps: David O'HARA) [f|1939] By Bursting Flash Bulbs (ps: David O'HARA) [f|1941] Who also appears to have written: BUD BRIGHT, BOY DETECTIVE -- 1929. Penn. BUD BRIGHT AND THE BANK ROBBERS -- 1929. Penn. BUD BRIGHT AND THE KIDNAPERS -- 1930. Penn. BUD BRIGHT AND THE DRUG RING -- 1931. Penn. BUD BRIGHT AND THE COUNTERFEITERS -- 1931. Penn. Are these two people the same? I have no idea. They both wrote about the same time in simular genres. Even if they are, we still only have a birth date and no death date. It just goes to show how difficult it is to track down dates for authority records, and LOC does make mistakes from time to time. b/ -- Brad Collins , Bangkok, Thailand From shalesller at writeme.com Wed Jan 12 22:41:44 2005 From: shalesller at writeme.com (D. 
Starner) Date: Wed Jan 12 22:42:01 2005 Subject: [gutvol-d] Bibliographic data Message-ID: <20050113064144.3BC424BDAB@ws1-1.us4.outblaze.com> "Wallace J.McLean" writes: > The benefit, of course, is that once the death date of an author is > conclusively determined, then, barring wrinkles like collaborators, > translators, co-authors, illustrators, and posthumous publication, all > of that author's works are, by definition, cleared. The catch is, you don't get to bar wrinkles like that. You have to deal with them. By the same rule of thumb, any book printed before 1923 is, by definition, cleared. Go US! > However, since third parties must also > rely on the term of copyright rules to determine what they, as third > parties, are entitled to do with a work, it would be unfair to the > public if they (we) were estopped from using an ancient work simply > because no one knew who the author was, to calculate the term, to > decide whether the work is public domain or not. In other words, there has to be special exceptions in the law because it's messier than just printed by date. > > I found "Oklahoma, and other poems" in my library. It took 30 seconds to > > verify that it was in the public domain in the US. It took 20 minutes of > > searching by library staff to find out that he died in 1951. > Either way, it's easier to look up <1001 -- probably around 750 -- > author's death dates (and not get some) than it is to look up 1001 -- > no less -- publication dates (and not get some of them, either). There is a grand total of one book that I've looked into and not been able to definitely pin down a publication date, and that was cleared anyway. I can walk the shelves of my library and check to see whether most books are in the public domain just by opening the cover. I can't do that with a life+years system; in that case, I have to remember when they died. > > You complain that a book you have doesn't have a publication date. 
In my > > experience, that's a rarity > > Not in mine, unfortunately. It's very common with 19th-century English > and American imprints, and extremely common with continental ones > before WWII. A American imprint without a date is in the public domain so long as you can prove it was printed before 1989. Go US! > > It's not that simple. > I'm assuming, for the sake of argument, that it is. Translators? New > editions? Typographical arrangements? I'm blind to them for the purpose > of this analysis. Do they have an impact on what PG or anyone else can > do with those works? Yes. Do I care? Not for the purpose of this little > thought-experiment, no. Then you're stacking the deck. Anyone can win if they do that. > > Honestly, just confirming the death dates for most authors tells you that > > all of thier works are in the public domain in the US. The exception, > > works published after their death or still unpublished > Or works published after 1922. No, most authors died before 1923 and hence anything published after 1922 is in the public domain. > And I don't care to look up TWO data > points on many books; I'd rather just look up one data point, and > recycle as many of those lookups as I can. It's easier just to skip the whole thing altogether. If you're doing this for PG, the question shouldn't be what's easier, it should be what's useful to PG, which is US copyright law and publication dates. -- ___________________________________________________________ Sign-up for Ads Free at Mail.com http://promo.mail.com/adsfreejump.htm From bruce at zuhause.org Thu Jan 13 06:56:14 2005 From: bruce at zuhause.org (Bruce Albrecht) Date: Thu Jan 13 06:56:20 2005 Subject: [gutvol-d] Canadian and American copyrights In-Reply-To: <1bf7101c4825.1c48251bf710@ncf.ca> References: <1bf7101c4825.1c48251bf710@ncf.ca> Message-ID: <16870.35854.91677.856806@celery.zuhause.org> Wallace J.McLean writes: > > That's a good question. 
It brings up the issue of what kind of > > copyright clearance is appropriate or neccessary for texts heading > > into a PG of Canada collection. > > > Or any PG of [COUNTRY WITH LIFE+} collection, for that matter. > > The benefit, of course, is that once the death date of an author is > conclusively determined, then, barring wrinkles like collaborators, > translators, co-authors, illustrators, and posthumous publication, all > of that author's works are, by definition, cleared. And what about the myriad series that were basically contracted by the publisher, and each one was "authored" by the publisher's chosen pen name? Is that considered anonymously authored? From jeroen.mailinglist at bohol.ph Thu Jan 13 10:50:19 2005 From: jeroen.mailinglist at bohol.ph (Jeroen Hellingman (Mailing List Account)) Date: Thu Jan 13 10:49:18 2005 Subject: [gutvol-d] Anonymous/pseudonymous term In-Reply-To: <1c0e111c0697.1c06971c0e11@ncf.ca> References: <1c0e111c0697.1c06971c0e11@ncf.ca> Message-ID: <41E6C2EB.1030200@bohol.ph> Wallace J.McLean wrote: >What's really unfair is the idiotic 95-year term under the US code for >anonymous or pseudonymous works. > > > > Which means everybody with a life expectancy below 15 years in the future (i.e. above the age of 75 or so), better forsake placing his name on his work! So far moral rights of attribution.... However, in real, economical sense, the present day value difference between 70 years copyright and 95 years copyright is a few dollars at most, even for a would-be classic. Jeroen Hellingman From ag737 at freenet.carleton.ca Thu Jan 13 12:14:03 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Thu Jan 13 12:14:12 2005 Subject: [gutvol-d] Bibliographic data Message-ID: <1d59621da32d.1da32d1d5962@ncf.ca> ----- Original Message ----- >From "D. 
Starner" Date Wed, 12 Jan 2005 22:41:44 -0800 To "Project Gutenberg Volunteer Discussion" Subject Re: [gutvol-d] Bibliographic data "Wallace J.McLean" writes: >> The benefit, of course, is that once the death date of an author is >> conclusively determined, then, barring wrinkles like collaborators, >> translators, co-authors, illustrators, and posthumous publication, all >> of that author's works are, by definition, cleared. >The catch is, you don't get to bar wrinkles like that. You have to deal >with them. By the same rule of thumb, any book printed before 1923 is, >by definition, cleared. Go US! And they are dealwithable on the basis of intrinsic evidence, intrinsic to the book. I'll take a post-1923 public domain any day, thank you very much. I'm currently working on the oeuvres of an author who died in 1939, so that the text, at least, of all his books, are public domain in Canada. (Some of the illustrations are by an illustrator who died in the 1970s; the illustrations aren't clearable, but can be excised.) However, about half of his works are post-1923 imprints. Boo, U.S. >> However, since third parties must also >> rely on the term of copyright rules to determine what they, as third >> parties, are entitled to do with a work, it would be unfair to the >> public if they (we) were estopped from using an ancient work simply >> because no one knew who the author was, to calculate the term, to >> decide whether the work is public domain or not. >In other words, there has to be special exceptions in the law because >it's messier than just printed by date. The US "just printed by date" rule won't be around forever: assuming no further changes in the US law, you'll have to start dealing with many of the same questions starting in 2019... and you'll STILL have a smaller public domain to play with. > There is a grand total of one book that I've looked into and not been able > to definitely pin down a publication date, and that was cleared anyway. 
You've been extremely lucky. In my personal library, I'd say 5 to 10 percent of my historical collection has indeterminate publication dates. > I can walk the shelves of my library and check to see whether most books > are in the public domain just by opening the cover. I can't do that with > a life+years system; in that case, I have to remember when they died. And you still have to do that for every single book; you can't batch- clear. > A American imprint without a date is in the public domain so long as you > can prove it was printed before 1989. Go US! I'm not certain that's a correct statement of the law. Are you? >> I'm assuming, for the sake of argument, that it is. Translators? New >> editions? Typographical arrangements? I'm blind to them for the purpose >> of this analysis. Do they have an impact on what PG or anyone else can >> do with those works? Yes. Do I care? Not for the purpose of this little >> thought-experiment, no. >Then you're stacking the deck. Anyone can win if they do that. Not at all. I'm just making my life easier. Why are you being a prick? >>> Honestly, just confirming the death dates for most authors tells you that >>> all of thier works are in the public domain in the US. The exception, >>> works published after their death or still unpublished >> Or works published after 1922. >No, most authors died before 1923 and hence anything published after 1922 is >in the public domain. Oh? Tell that to the PG clearance team. > It's easier just to skip the whole thing altogether. If you're doing > this for PG, the question shouldn't be what's easier, it should be > what's useful to PG, which is US copyright law and publication dates. I'm doing it for me. Someone else can do it for PG under US law. Are you volunteering? Thanks. 
From ag737 at freenet.carleton.ca Thu Jan 13 12:18:05 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Thu Jan 13 12:18:15 2005 Subject: [gutvol-d] Canadian and American copyrights Message-ID: <1d5f4f1d5dcc.1d5dcc1d5f4f@ncf.ca> Wallace J.McLean writes: >> The benefit, of course, is that once the death date of an author is >> conclusively determined, then, barring wrinkles like collaborators, >> translators, co-authors, illustrators, and posthumous publication, all >> of that author's works are, by definition, cleared. >And what about the myriad series that were basically contracted by the >publisher, and each one was "authored" by the publisher's chosen pen >name? Is that considered anonymously authored? No, that's considered pseudonymously authored! If the true identity of the author is not, has not been, and cannot reasonably be known, then the publication+50 rule kicks in. This was recently confirmed under Canadian law in a decision by the Copyright Board. If there's authority for the statement that John Smith was the true author of "Johnny and the Skyship" by "Charles Vander Pelt", published in 1939, then the life span of John Smith determines the duration of the copyright. If John Smith died in or before 1954, the work is public domain in Canada; if he died after 1954, it is still under copyright. If, on the other hand, there is no authority anywhere for the true identity of the author of "Johnny and the Skyship" by "Charles Vander Pelt", published in 1939, then the work is public domain under Canadian law, and has been since January 1, 1990. The work is NOT public domain in the US or any life+70 country that also has publication+70 for pseud/anon works. 
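A minimal sketch of the arithmetic in the message above. It assumes, as the January 1, 1990 date for a 1939 publication implies, that the term runs to the end of the calendar year 50 years after the author's death or, where the author cannot be identified, 50 years after publication. The function name and the term_years parameter are illustrative only and come from no statute; the wrinkles raised earlier in the thread (translators, illustrators, posthumous publication) are ignored.

```python
from datetime import date


def public_domain_date(start_year, term_years=50):
    """Return the first day a work is out of copyright.

    start_year is the author's year of death when the author is known,
    or the year of publication for an anonymous or pseudonymous work.
    Assumption: the term expires at the end of the calendar year
    start_year + term_years, so the work is free on the next January 1.
    """
    return date(start_year + term_years + 1, 1, 1)


# Known author who died in 1954: free in a life+50 country from 2005.
print(public_domain_date(1954))        # 2005-01-01
# Author unknown, work published in 1939: free from 1990.
print(public_domain_date(1939))        # 1990-01-01
# The same 1939 publication under a publication+70 rule.
print(public_domain_date(1939, 70))    # 2010-01-01
```

Because the only input for a known author is the year of death, one lookup settles every original work by that author, which is the batch-clearing point made earlier in the thread.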
From marcello at perathoner.de Thu Jan 13 10:29:52 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Jan 13 13:19:06 2005 Subject: [gutvol-d] Perl help needed In-Reply-To: <006501c4f908$de215ec0$6401a8c0@sarek> References: <006501c4f908$de215ec0$6401a8c0@sarek> Message-ID: <41E6BE20.7010409@perathoner.de> John Hagerson wrote: > I have no experience at all with perl and I need to parse some information > from the PG catalog.rdf file. There are some examples at: www.gutenberg.org/feeds/examples/ -- Marcello Perathoner webmaster@gutenberg.org From shimmin at uiuc.edu Thu Jan 13 13:23:56 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Thu Jan 13 13:24:08 2005 Subject: [gutvol-d] Bibliographic data In-Reply-To: <1d59621da32d.1da32d1d5962@ncf.ca> References: <1d59621da32d.1da32d1d5962@ncf.ca> Message-ID: <41E6E6EC.1030504@uiuc.edu> > The US "just printed by date" rule won't be around forever: assuming no > further changes in the US law, you'll have to start dealing with many > of the same questions starting in 2019... and you'll STILL have a > smaller public domain to play with. Actually, when the US public domain begins to grow again in 2019, it will still be on a fixed-term system for a long time to come. No life+X copyrights were issued in the United States until January 1, 1978. > And you still have to do that for every single book; you can't batch- > clear. You have to do it for every single book under a life + system, too: you have to examine the book to see who the author is before you can refer to your list of death-dates. Of course, this is a silly argument to begin with, because regardless of the copyright regime, verifying public domain status is usually going to be a negligible effort compared to actually digitizing the book. > A American imprint without a date is in the public domain so long as > you can prove it was printed before 1989. Go US! > > I'm not certain that's a correct statement of the law. Are you? 
It's not exactly a correct statement of the law, but you won't get into too much trouble following it. More correctly, before the effective date of the Berne Convention Implementation Act (March 1, 1989), if a book was published in the United States, by the authority of the copyright holder, and did not bear proper notice of copyright (which includes a date), it invalidated copyright in the work. Copyrights on works of foreign authors that lost copyright by this provision were restored on January 1, 1996, but the possession of a copy without the copyright notice, published before March 1, 1989, is usually enough to base an "innocent infringement" defense on. -- RS From j.hagerson at comcast.net Thu Jan 13 16:52:31 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Thu Jan 13 16:52:45 2005 Subject: [gutvol-d] Perl help needed In-Reply-To: <41E6BE20.7010409@perathoner.de> Message-ID: <006d01c4f9d3$57583230$6401a8c0@sarek> This is true, and the examples might even be sufficient, if I knew anything about perl. However, I don't know anything about perl. The perl resources I have found have been the equivalent of offering a course in metallurgy to someone asking for driving directions. It's ordinarily not necessary to know how to build a car to drive one. Here's what I have done. I downloaded a ActivePerl from ActiveState (yes, I am running Windows). I tried to use the first example from the link given below (rdf-parse-example.pl) and was told that I don't have XML/LibXML. At this point, I'm stuck. Thank you. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Marcello Perathoner wrote: John Hagerson wrote: > I have no experience at all with perl and I need to parse some information > from the PG catalog.rdf file. There are some examples at: www.gutenberg.org/feeds/examples/ From hacker at gnu-designs.com Thu Jan 13 16:59:58 2005 From: hacker at gnu-designs.com (David A. 
Desrosiers) Date: Thu Jan 13 17:00:36 2005 Subject: [gutvol-d] Perl help needed In-Reply-To: <006d01c4f9d3$57583230$6401a8c0@sarek> References: <006d01c4f9d3$57583230$6401a8c0@sarek> Message-ID: > Here's what I have done. I downloaded a ActivePerl from ActiveState > (yes, I am running Windows). I tried to use the first example from > the link given below (rdf-parse-example.pl) and was told that I > don't have XML/LibXML. Install that from CPAN, or whatever the Windows equivalent tool to query CPAN is. PPM I think? David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From j.hagerson at comcast.net Thu Jan 13 18:03:36 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Thu Jan 13 18:03:51 2005 Subject: [gutvol-d] perl helper found Message-ID: <007201c4f9dd$452cd070$6401a8c0@sarek> Thank you! John Hagerson From phil at hitchcock99.freeserve.co.uk Fri Jan 14 00:56:51 2005 From: phil at hitchcock99.freeserve.co.uk (Phil Hitchcock) Date: Fri Jan 14 08:08:00 2005 Subject: [gutvol-d] Preferred diacritical mark Message-ID: <003d01c4fa53$06e33680$b0a3883e@freeserve.co.uk> I am currently preparing e-text versions of "W H Sleeman's, Rambles and Recollections of an Indian Official". The book describes life and customs in India in the 1830's. Many of the place names, personal names, and various other words have a dash - placed over an a, e, i, or u, to indicate a long vowel. When I produce the 7-bit ASCII plain text, these marks will be missing; in the 8-bit ASCII version I am planning to use a circumflex accent ^ to replace the diacritical mark. However in a HTML version I could use the circumflex accent, or I could use the Unicode series starting with &#256; to give a vowel with a dash over it, thus reproducing the original text form. However I have seen some present day publications using the circumflex accent on Indian place names. Thus, I am wondering what the Project Gutenberg preferred form would be for the diacritical mark in the HTML version.
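The encoding options weighed in the message above can be sketched as follows. This is only an illustration, assuming Python 3 and its standard unicodedata module; the function names and the sample string are made up for the example and are not PG tooling or PG policy.

```python
import unicodedata

# Sample text: lower-case a, e, i, u with macron (U+0101, U+0113, U+012B, U+016B).
SAMPLE = "\u0101\u0113\u012b\u016b"


def html_reference(ch):
    """Decimal numeric character reference for one character, for HTML."""
    return "&#{};".format(ord(ch))


def strip_marks(text):
    """7-bit fallback: decompose each letter, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def to_circumflex(text):
    """8-bit fallback: swap the combining macron for a combining circumflex."""
    decomposed = unicodedata.normalize("NFD", text)
    swapped = decomposed.replace("\u0304", "\u0302")
    return unicodedata.normalize("NFC", swapped)


def fits_latin1(text):
    """True if every character can be stored in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(html_reference("\u0101"))            # &#257;
print(strip_marks(SAMPLE))                 # aeiu
print(fits_latin1(SAMPLE))                 # False
print(fits_latin1(to_circumflex(SAMPLE)))  # True
```

The last two lines show the trade-off in question: the macron forms cannot be stored in an ISO-8859-1 file at all, while the circumflex forms can, at the cost of no longer matching the printed book.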
Philip Hitchcock Hertfordshire, UK. From sly at victoria.tc.ca Fri Jan 14 09:38:54 2005 From: sly at victoria.tc.ca (Andrew Sly) Date: Fri Jan 14 09:39:00 2005 Subject: [gutvol-d] Preferred diacritical mark In-Reply-To: <003d01c4fa53$06e33680$b0a3883e@freeserve.co.uk> References: <003d01c4fa53$06e33680$b0a3883e@freeserve.co.uk> Message-ID: I believe the combined experience of many PG volunteers preparing texts for PG has shown that the best course is: "preserve what is used in the original text". We don't change old spellings of place names in English to "correct" them, so why do so in another language? Personally, I would leave the marks out of the 8-bit text, as the original characters cannot be reproduced using ISO-Latin-1. You may also want to consider making a Unicode plain text file. See "Through the Mackenzie Basin: A Narrative of the Athabasca and Peace River Treaty Expedition of 1899" (http://www.gutenberg.org/etext/12569) for a similar example of the author rendering Native North American proper names with accents over some letters (in this case, acute accents over consonants). Thanks, Andrew On Fri, 14 Jan 2005, Phil Hitchcock wrote: > I am currently preparing e-text versions of "W H Sleeman's, Rambles and > Recollections of an Indian Official". The book describes life and customs in > India in the 1830's. > > Many of the place names, personal names, and various other words have a > dash - placed over an a, e, i, or u, to indicate a long vowel. > > When I produce the 7-bit ASCII plain text, these marks will be missing; in > the 8-bit ASCII version I am planning to use a circumflex accent ^ to > replace the diacritical mark.
> > However in a HTML version I could use the circumflex > accent, or I could use the Unicode series starting with &#256; to give a > vowel with a dash over it, thus reproducing the original text form. However > I have seen some present day publications using the circumflex accent on > Indian place names. > > Thus, I am wondering what the Project Gutenberg preferred form would be for > the diacritical mark in the HTML version. > > Philip Hitchcock > Hertfordshire, UK. From JHagerson at ftportfolios.com Fri Jan 14 07:05:09 2005 From: JHagerson at ftportfolios.com (Hagerson, John) Date: Fri Jan 14 10:07:34 2005 Subject: [gutvol-d] Wired Magazine "gets it" on copyright Message-ID: http://www.wired.com/wired/archive/13.01/view.html?pg=5 From shalesller at writeme.com Fri Jan 14 16:39:04 2005 From: shalesller at writeme.com (D. Starner) Date: Fri Jan 14 16:39:17 2005 Subject: [gutvol-d] Preferred diacritical mark Message-ID: <20050115003904.A54D14BDAB@ws1-1.us4.outblaze.com> "Phil Hitchcock" writes: > I am currently preparing e-text versions of "W H Sleeman's, Rambles and > Recollections of an Indian Official". The book describes life and customs in > India in the 1830's. > > Many of the place names, personal names, and various other words have a > dash - placed over an a, e, i, or u, to indicate a long vowel. > > When I produce the 7-bit ASCII plain text, these marks will be missing; in > the 8-bit ASCII version I am planning to use a circumflex accent ^ to > replace the diacritical mark. > > However in a HTML version I could use the circumflex > accent, or I could use the Unicode series starting with &#256; to give a > vowel with a dash over it, thus reproducing the original text form. However > I have seen some present day publications using the circumflex accent on > Indian place names.
> > Thus, I am wondering what the Project Gutenberg preferred form would be for > the diacritical mark in the HTML version. Always replicate what's in the book. There is sometimes a call for modernized versions, but we should have the original, unmodernized version first. I think using the circumflex in the Latin-1* version is a suitable replacement, especially as that's what modern users are using, but put a transcriber's note at the top of the document noting that's what you've done. * There is no such thing as 8-bit ASCII. There's only 7-bit ASCII. There are 8-bit extensions to ASCII, literally hundreds of them. Latin-1 (also known as ISO standard 8859, part 1, or ISO-8859-1) is the one that PG usually uses; it's very similar to CP-1252 (for most purposes, a subset of CP-1252), the character set that Windows 95 onward uses for Western Europe, and most likely you're most familiar with one of those two. From stephen.thomas at adelaide.edu.au Fri Jan 14 20:40:22 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Fri Jan 14 20:40:44 2005 Subject: [gutvol-d] Preferred diacritical mark In-Reply-To: <003d01c4fa53$06e33680$b0a3883e@freeserve.co.uk> References: <003d01c4fa53$06e33680$b0a3883e@freeserve.co.uk> Message-ID: <41E89EB6.9040306@adelaide.edu.au> Phil Hitchcock wrote: >I am currently preparing e-text versions of "W H Sleeman's, Rambles and >Recollections of an Indian Official". The book describes life and customs in >India in the 1830's. > >Many of the place names, personal names, and various other words have a >dash - placed over an a, e, i, or u, to indicate a long vowel. > > The 'dash' over the vowel is known as a macron. A lower-case a with macron can be coded in Unicode with U+0101. As you say, you could code this in HTML as &#257; -- or you could specify UTF-8 encoding for your document and just paste the character in.
(E.g. in Windows, go to "Start/Programs/Accessories/System tools" and choose Character Map, and copy-paste the appropriate characters from there.) If you use UTF-8 encoding, you can create your document as text, html, RTF, PDF, ... whatever, and you'll have the correct character in all cases. Note that the macron is not the same as a circumflex, so if you're aiming for accurate repro of the original edition, you'll want to use the macron. -- Stephen Thomas, Senior Systems Analyst, Adelaide University Library ADELAIDE UNIVERSITY SA 5005 AUSTRALIA Tel: +61 8 8303 5190 Fax: +61 8 8303 4369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ From brad at chenla.org Fri Jan 14 22:02:13 2005 From: brad at chenla.org (Brad Collins) Date: Fri Jan 14 22:05:08 2005 Subject: [gutvol-d] Dumb copyright question... In-Reply-To: (Steve Herber's message of "Mon, 10 Jan 2005 12:17:24 -0800 (PST)") References: Message-ID: I finally managed to shlump my books from my old home in the jungle near the Mekong back to Bangkok and found an old copy of William Carlos Williams' Spring and All which was first published in 1923. Is the cutoff date in the States the end of 1923 or *before* 1923? WCW's Kora in Hell came out in 1920 so this is certainly clear. But I would love to get Spring and All out to the world. The pure products of America go crazy -- was a big influence on and a precursor to Ginsberg's Howl and the Beat poets in general. Spring and All was also an important starting point for the Black Mountain poets (Charles Olson, Robert Creeley and Robert Duncan). Spring and All completely blew me away. I read the opening and was completely hooked: If anything of moment results -- so much the better. And so much the more likely will it be that no one will want to see it. There is a constant barrier between the reader and his consciousness of immediate contact with the world. If there is an ocean it is here.
Or rather the whole world is in between: Yesterday, tomorrow, Europe, Asia, Africa, -- all things removed and impossible, the tower of the church at Seville, the Parthenon. It would be wonderful to help old Doc Williams to blow away a new generation of aspiring poets. If the book is clear I'll begin work.... Cheers, b/ -- Brad Collins , Bangkok, Thailand From sly at victoria.tc.ca Sat Jan 15 00:19:56 2005 From: sly at victoria.tc.ca (Andrew Sly) Date: Sat Jan 15 00:20:18 2005 Subject: [gutvol-d] Dumb copyright question... In-Reply-To: References: Message-ID: Asking on this list about copyright clearance for a particular item will only get you speculation as a result. The only way to really know if the book you have in mind can be copyright cleared by PG is to go ahead and send in the tp&v and see what the results are. That said, I'm afraid this title does not look like it stands too good a chance. Anything published _before_ 1923 can almost certainly be cleared by PG. Anything published after 1922 will present much more of a problem, and given that Williams seems to be highly regarded as a poet, it's exceedingly likely that copyright was renewed on his titles. It appears that Williams' dates were (1883-1963) which means that his writings would not be PD in a life+50 country either... Andrew On Sat, 15 Jan 2005, Brad Collins wrote: > > I finally managed to shlump my books from my old home in the jungle > near the Mekong back to Bangkok and found an old copy of William > Carlos Williams' Spring and All which was first published in 1923. > > Is the cutoff date in the States the end of 1923 or *before* 1923? > > WCW's Kora in Hell came out in 1920 so this is certainly clear. But > I would love to get Spring and All out to the world. > > The pure products of America > go crazy -- > > was a big influence on and a precursor to Ginsberg's Howl and the Beat > poets in general.
Spring and All was also an important starting point > for the Black Mountain poets (Charles Olson, Robert Creely and Robert > Duncan). > > Spring and All completely blew me away. I read the opening and was > completely hooked: > > If anything of moment results -- so much the better. And so much > the more likely will it be no will want to see it. > > There is a constant barrier between the reader and his > conciousness of immediate contact with the world. If there is an > ocean it is here. Or rather the whole world is in between: > Yesterday, tomorrow, Europe, Asia, Africa, -- all things removed > and impossible, the tower of the church at Seville, the Parthenon. > > It would be wonderful to help old Doc Williams to blow away a new > generation of aspiring poets. > > If the book is clear I'll begin work.... > > Cheers, > > b/ > > From shimmin at uiuc.edu Sat Jan 15 04:57:20 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Sat Jan 15 04:57:45 2005 Subject: [gutvol-d] Dumb copyright question... In-Reply-To: References: Message-ID: <41E91330.9050606@uiuc.edu> I took a brief gander at the New General Catalog of Old Books and Authors, and it lists Spring and All as first published in 1922. If you can verify this, it will probably clear. Also, it might be worth your while to establish non-renewal, if you really wanted to get this particular book out. Despite WCW's reputation, I noted that the NGC listed 'The Great American Novel' among books published in 1923 that did not have their copyright renewed. This isn't particularly surprising. I've found Pulitzer prize winning novelists who didn't bother to renew their copyrights on their early work. -- RS From jon at noring.name Sat Jan 15 12:37:12 2005 From: jon at noring.name (Jon Noring) Date: Sat Jan 15 12:38:02 2005 Subject: [gutvol-d] Looking for text editor which does ... Message-ID: <616739250.20050115133712@noring.name> Everyone, A basic question...
I'd like to get the recommendations of the long-timers here for a Windows-based GUI text editor or utility which cleans up *selected* paragraphs of text (in plain text documents) to create uniform line lengths with hard line breaks. The situation is that I have a large marked-up text document where many paragraphs have varying and (many times) quite long line lengths. For example, a paragraph may consist of three lines, the first may be 250 characters long, the second 50 characters long, and the third 120 characters long -- and I'd like to "regularize" the paragraph with lines exactly 70 characters or less in length (this paragraph is an example of such "regularization".) I'd like to simply select those three lines in the utility, click a button or something, and the text is automagically "regularized" (no hyphenation, one space between words, etc.) It gets quite laborious doing this by hand with my text editor of choice, vi (I use Lemmy, a Windows vi-clone, for most of my text editing needs.) I do NOT want a tool which only globally does this to the whole document (i.e. there are longer lines in the document which I wish to keep unbroken.) And I do NOT want a tool requiring typing in a long command line -- by the time I do that I could regularize the paragraph by hand in my editor. I just want to select and regularize. So, what's out there? Obviously Project Gutenbergers must use various tools to "regularize" paragraphs. (There's no doubt a different word most everyone here uses to describe this process, but I don't know what it is, thus the use of the word "regularize".) Thanks. Jon Noring From ke at gnu.franken.de Sat Jan 15 12:58:31 2005 From: ke at gnu.franken.de (Karl Eichwalder) Date: Sat Jan 15 12:58:30 2005 Subject: [gutvol-d] Re: Looking for text editor which does ... 
In-Reply-To: <616739250.20050115133712@noring.name> (Jon Noring's message of "Sat, 15 Jan 2005 13:37:12 -0700") References: <616739250.20050115133712@noring.name> Message-ID: Jon Noring writes: > I just want to select and regularize. Press 'Esc q' (or 'Meta-q') in Emacs, if the pointer sits in the paragraph. In .emacs (or _emacs, the init file) set something like this: (setq-default fill-column 72) -- http://www.gnu.franken.de/ke/ | ,__o | _-\_<, | (*)/'(*) Key fingerprint = F138 B28F B7ED E0AC 1AB4 AA7F C90A 35C3 E9D0 5D1C From nwolcott at dsdial.net Sat Jan 15 12:28:51 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Sat Jan 15 13:02:20 2005 Subject: [gutvol-d] Jules Verne French texts Message-ID: <001601c4fb45$77eaac80$cf9495ce@gw98> Although there are 19 French texts of Jules Verne on PG, there are at least 17 more ready to go which only require converting from html to text format. I have identified at least V008, V010 .V014, V016, V017, V020, V024, V037, V043, V039, V047, V052, V055, V060 which are available at http://jv.gilead.org.il/works.html. Use the list at http://www.ibiblio.org/pub/docs/books/sherwood/Voyages_Extraordinaire.htm as a Vxxx check list. Several of these are also on http://www.ebooksgratuits.com/ (which also has many others and from which the above may have been taken) already in Word format. This conversion is not a big job, and no doubt the more technical of you already have software which will do the conversion at a click. It would be nice if PG could have as many of the 65 Voyages Extraordinaires as possible online by March 24, the centenary of Verne's death, when many celebrations are planned. While some of these have trickled on to PG from time to time, a more concerted effort is required to finish the job. N Wolcott nwolcott2@post.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050115/45ee0610/attachment.html From blondeel at clipper.ens.fr Sat Jan 15 13:30:26 2005 From: blondeel at clipper.ens.fr (Sebastien Blondeel) Date: Sat Jan 15 13:30:36 2005 Subject: [gutvol-d] Jules Verne French texts In-Reply-To: <001601c4fb45$77eaac80$cf9495ce@gw98> References: <001601c4fb45$77eaac80$cf9495ce@gw98> Message-ID: <20050115213026.GB10189@clipper.ens.fr> On Sat, Jan 15, 2005 at 03:28:51PM -0500, N Wolcott wrote: > Although there are 19 French texts of Jules Verne on PG, there at > least 17 more ready to go which only require converting from html to > text format. I have identified at least V008, V010 .V014, V016, V017, How long and difficult for you is this? It may prove much easier and faster for me (depending on your answer). > V020, V024, V037, V043, V039, V047, V052, V055, V060 which are > available at http://jv.gilead.org.il/works.html. Use the list at > http://www.ibiblio.org/pub/docs/books/sherwood/Voyages_Extraordinaire.htm > as a Vxxx check list. Interestingly, this bears copyright notice even on the books it recognizes come from PG: see at the end of http://jv.gilead.org.il/pg/vcen/01.html The answer "this is an automated footer, please ignore" is not an excuse to me. > Several of these are also on http://www.ebooksgratuits.com/ (which > also has many others and from which the above may have been taken) > already in Word format. This conversion is not a big job, and no > doubt the more technical of you already have software which will do I am in touch with ebooksgratuits to help them port their Word format to PG (in XHTML 1.0 Strict and TXT). See the tests at http://www.eleves.ens.fr/home/blondeel/PGDP/ebooksgratuits/ Right now I just found a Unix machine where I can put my conversion software as a CGI because my Perl scripts crash for him in Cygwin for no apparent reason. > the conversion at a click. 
It would be nice if PG could have as many > of the 65 Voyages Extraordinaires on by March 24, the centenary of > Verne's death when many celebrations are planned.. That should be possible, as long as you / we / I tell him PG would like as much of Jules Verne as possible by March 24th. > While some of these have trickled on to PG from time to time, a more > concerted effort is required to finish the job. I am preparing a long message for this list giving a number of general ideas to improve PG / PGDP, and this is one of them :) From morgad at eclipse.co.uk Sat Jan 15 13:34:03 2005 From: morgad at eclipse.co.uk (dave morgan) Date: Sat Jan 15 13:34:14 2005 Subject: [gutvol-d] Looking for text editor which does ... In-Reply-To: <616739250.20050115133712@noring.name> References: <616739250.20050115133712@noring.name> Message-ID: <9u2ju09te9n99jd5m2vajugj2k348fev69@4ax.com> On Sat, 15 Jan 2005 13:37:12 -0700, Jon Noring wrote: >Everyone, > >A basic question... > >I'd like to get the recommendations of the long-timers here for a >Windows-based GUI text editor or utility which cleans up *selected* >paragraphs of text (in plain text documents) to create uniform line >lengths with hard line breaks. > GuiGuts? http://mywebpages.comcast.net/thundergnat/guiguts.html Dave -- http://www.morgad.no-ip.info/index.html gpg:0x64B5E037 Distributed Proofreaders: http://www.pgdp.net The NTP server pool http://www.pool.ntp.org From brad at chenla.org Sat Jan 15 20:01:50 2005 From: brad at chenla.org (Brad Collins) Date: Sat Jan 15 20:04:47 2005 Subject: [gutvol-d] Re: Looking for text editor which does ... In-Reply-To: (Karl Eichwalder's message of "Sat, 15 Jan 2005 21:58:31 +0100") References: <616739250.20050115133712@noring.name> Message-ID: Karl Eichwalder writes: > Jon Noring writes: > >> I just want to select and regularize. > > Press 'Esc q' (or 'Meta-q') in Emacs, if the pointer sits in the > paragraph.
In .emacs (or _emacs, the init file) set soemthing like > this: > > (setq-default fill-column 72) I agree, M-q is very powerful! And if it doesn't do what you need let me know and I'll hack up an elisp script to do what you need. Emacs runs very nicely on Windows as well, and if you install Cygwin you can then run any of the Unix tools and other scripts on whatever you're working on. b/ -- Brad Collins , Bangkok, Thailand From gbnewby at pglaf.org Sun Jan 16 01:36:26 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Jan 16 01:36:28 2005 Subject: [gutvol-d] Dumb copyright question... In-Reply-To: References: Message-ID: <20050116093626.GB15926@pglaf.org> On Sat, Jan 15, 2005 at 01:02:13PM +0700, Brad Collins wrote: > > I finally managed to shlump my books from my old home in the jungle > near the Mekong back to Bangkok and found an old copy of William > Carlos Williams' Spring and All which was first published in 1923. > > Is the cutoff date in the States the end of 1923 or *before* 1923? Before 1923. Through the end of 1922. -- Greg From traverso at dm.unipi.it Sun Jan 16 03:17:05 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Sun Jan 16 03:13:08 2005 Subject: [gutvol-d] Re: Looking for text editor which does ... In-Reply-To: (message from Brad Collins on Sun, 16 Jan 2005 11:01:50 +0700) References: <616739250.20050115133712@noring.name> Message-ID: <200501161117.j0GBH5i12640@posso.dm.unipi.it> >>>>> "Brad" == Brad Collins writes: Brad> Karl Eichwalder writes: >> Jon Noring writes: >> >>> I just want to select and regularize. >> Press 'Esc q' (or 'Meta-q') in Emacs, if the pointer sits in >> the paragraph. In .emacs (or _emacs, the init file) set >> soemthing like this: >> >> (setq-default fill-column 72) Brad> I agree, M-q is very powerful! And if it doesn't do what Brad> you need let me know and I'll hack up a elisp script to do Brad> what you need. fill-region is probably what is needed: select several paragraphs and fill them.
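For anyone without Emacs to hand, the same idea (fill only the text that was selected, leave everything else alone) can be sketched in Python with the standard textwrap module. This is an illustrative assumption, not a tool anyone on the list mentioned; the function name fill_paragraphs and the default width of 70 are made up for the example, the width being the figure from the original request.

```python
import textwrap


def fill_paragraphs(selection: str, width: int = 70) -> str:
    """Rewrap each blank-line-separated paragraph of `selection`.

    Lines come out at most `width` characters long, with single spaces
    between words and no hyphenation introduced by the wrapping.
    """
    filled = []
    for paragraph in selection.split("\n\n"):
        # Join the old hard-wrapped lines and collapse runs of whitespace.
        one_line = " ".join(paragraph.split())
        filled.append(
            textwrap.fill(
                one_line,
                width=width,
                break_long_words=False,   # never split a word to fit
                break_on_hyphens=False,   # never break at existing hyphens
            )
        )
    return "\n\n".join(filled)


if __name__ == "__main__":
    ragged = "A  very long first line " + "word " * 40 + "\nshort line\n\nsecond   paragraph"
    print(fill_paragraphs(ragged))
```

Only the string passed in is touched, so long lines elsewhere in the document stay unbroken as long as they are not part of the selection.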
I have an elisp macro that fills every paragraph, except the indented lines. And another that fills everything except the text between markup. Carlo Traverso From joshua at hutchinson.net Sun Jan 16 05:51:42 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Sun Jan 16 05:51:23 2005 Subject: [gutvol-d] Looking for text editor which does ... In-Reply-To: <9u2ju09te9n99jd5m2vajugj2k348fev69@4ax.com> References: <616739250.20050115133712@noring.name> <9u2ju09te9n99jd5m2vajugj2k348fev69@4ax.com> Message-ID: <41EA716E.2030402@hutchinson.net> dave morgan wrote: >On Sat, 15 Jan 2005 13:37:12 -0700, Jon Noring >wrote: > > > >>Everyone, >> >>A basic question... >> >>I'd like to get the recommendations of the long-timers here for a >>Windows-based GUI text editor or utility which cleans up *selected* >>paragraphs of text (in plain text documents) to create uniform line >>lengths with hard line breaks. >> >> >> > >GuiGuts? http://mywebpages.comcast.net/thundergnat/guiguts.html > >Dave > > Let me second this with much emphasis. Guiguts is probably the simplest to use and was expressly made for making PGDP texts. Using DP markup, it can even be set to regularize an entire text document but skip sections marked off with /* */ or /# #/ or /$ $/, but this is beyond what you are asking for. It is very easy to select a paragraph or two and then hit the rewrap. Josh From jon at noring.name Sun Jan 16 09:55:01 2005 From: jon at noring.name (Jon Noring) Date: Sun Jan 16 09:55:27 2005 Subject: [gutvol-d] Thanks!
(was "Looking for text editor which does ...") In-Reply-To: <41EA716E.2030402@hutchinson.net> References: <616739250.20050115133712@noring.name> <9u2ju09te9n99jd5m2vajugj2k348fev69@4ax.com> <41EA716E.2030402@hutchinson.net> Message-ID: <935883953.20050116105501@noring.name> Josh wrote: > Dave Morgan wrote: >> Jon asked: >>> I'd like to get the recommendations of the long-timers here for a >>> Windows-based GUI text editor or utility which cleans up >>> *selected* paragraphs of text (in plain text documents) to create >>> uniform line lengths with hard line breaks. >> GuiGuts? http://mywebpages.comcast.net/thundergnat/guiguts.html > Let me second this much emphasis. Guiguts is probably the simplest to > use and was expressly made for making PGDP texts. > > Using DP markup, it can even be set to regularize an entire text > document but skip sections marked off with /* */ or /# #/ or /$ $/, but > this is beyond what you are asking for. > > It is very easy to select a paragraph or two and then hit the rewrap. Several of you replied to my request, both in public and in private, and I want to thank each of you for your feedback. It turned out that I used Power Edit 2.1 (the downloadable trial version) to do the job that needed doing at the time. It allowed me to select the exact text and to regularize/rewrap it by pushing a single "hot key" which I specially assigned to this operation (I used the 'F1' key as the "hot key" -- my right hand did the text selection with the mouse, and my left finger pushed 'F1', so I was able to "machine gun" the whole text in no time flat, yet get the fine control I needed.) The adjustable line length was set to the desired value before I began the rewrapping. Power Edit 2.1 also has a fairly nice HTML highlighting feature which is quite flexible in tailoring (although a little buggy since I could not edit the html.syn file to recognize and specially highlight entities: "&...;" -- probably a bug in handling the "&" character.) 
The other two recommended solutions I looked at were NoteTab Pro 4.95 (the working trial version) and GuiGuts. NoteTab Pro is in many ways a more powerful text editor than Power Edit, and I was very impressed with its feature set. I was even able to add a button to the buttonbar to do the rewrap; however, every time I pushed the rewrap button, it brought up an annoying menu asking me how many characters wide I wanted the rewrapped text to be (it was preset already.) This added another unneeded step in the process. Why it doesn't allow one to optionally set this beforehand with no query, as Power Edit does, is sort of strange (maybe it does, but I could not find how to turn this off.) NoteTab Pro also has an HTML highlighting feature, which isn't quite as powerful as that for Power Edit 2.1, but it works (as noted above, I like the ability to highlight the style of the markup, and not only the color -- making the tags bold, for example, helps to see them better.) GuiGuts also worked excellently, and as Dave and Josh noted, is specially designed for the particular needs of PG texts. On the downside for the particular task of rewrapping, GuiGuts does not appear to allow the end-user (although I'm not sure) to reassign hot keys for particular tasks to something more convenient (the rewrap was by default hot-keyed to "Alt-s-r", which is more difficult to enable than pushing a single button -- I would have wanted to temporarily reassign 'Alt-s-r' to 'F1', for example.) Also, it seems like GuiGuts does not highlight markup (again not sure on this), a feature I think is important when editing marked up (versus plain) texts. On the up side, GuiGuts has a lot of cool features which Power Edit and NoteTab Pro don't appear to have. For example, the convenient way GuiGuts will insert Unicode characters (I assume it will save the resulting text as UTF-8 or UTF-16?)
It also handles the unusual annoyances found in creating/editing/marking-up PG texts that those who build general text editors are not aware of. Anyway, just my impressions. Thanks again, everyone. Jon Noring From donovan at abs.net Sun Jan 16 10:48:02 2005 From: donovan at abs.net (D Garcia) Date: Sun Jan 16 10:49:44 2005 Subject: [gutvol-d] Looking for text editor which does ... In-Reply-To: <616739250.20050115133712@noring.name> References: <616739250.20050115133712@noring.name> Message-ID: <200501161348.02998.donovan@abs.net> On Saturday 15 January 2005 03:37 pm, Jon Noring wrote: > I'd like to get the recommendations of the long-timers here for a > Windows-based GUI text editor or utility which cleans up *selected* > paragraphs of text (in plain text documents) to create uniform line > lengths with hard line breaks. Programmer's File Editor (obsolete, not maintained) will wrap the paragraph the pointer is in up to or under the line length set in the preferences. Or it will do the same for a selected region as well. F7 (reformat paragraph) is the default hotkey I believe. It also supports macros, so you can set up some tricks of your own. From donovan at abs.net Sun Jan 16 14:10:39 2005 From: donovan at abs.net (D Garcia) Date: Sun Jan 16 14:12:24 2005 Subject: [gutvol-d] Nightly feed of clearance information--Possible? Message-ID: <200501161710.39963.donovan@abs.net> It would be excellent to have a nightly-updated feed of appropriately anonymized Cleared clearances published by PG in the same or similar format as the Catalog feed. Granted, this is not the same as David Price's list, but it would be beneficial to would-be content providers and take some of the load off of him. There is the drawback that the old 'gbn' clearances would not be in the new database, but it should be possible if there is a willing volunteer able to convert those appropriately to the new format, and I will mention that I have appropriate experience in this area. 
I can foresee some difficulty in tying old clearances to current PG usernames, though many should be mappable via email address. Thoughts? From gbnewby at pglaf.org Sun Jan 16 14:21:24 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Jan 16 14:21:26 2005 Subject: [gutvol-d] Re: Looking for text editor which does ... In-Reply-To: <200501161117.j0GBH5i12640@posso.dm.unipi.it> References: <616739250.20050115133712@noring.name> <200501161117.j0GBH5i12640@posso.dm.unipi.it> Message-ID: <20050116222124.GD9811@pglaf.org> On Sun, Jan 16, 2005 at 12:17:05PM +0100, Carlo Traverso wrote: > >>>>> "Brad" == Brad Collins writes: > > Brad> Karl Eichwalder writes: > > >> Jon Noring writes: > >> > >>> I just want to select and regularize. > >> Press 'Esc q' (or 'Meta-q') in Emacs, if the pointer sits in > >> the paragraph. In .emacs (or _emacs, the init file) set > >> soemthing like this: > >> > >> (setq-default fill-column 72) > > Brad> I agree, M-q is very powerful! And if it doesn't do what > Brad> you need let me know and I'll hack up a elisp script to do > Brad> what you need. > > fill-region is probably what is needed: select several paragraphs and > fill them. > > I have an elisp macro that fills every paragraph, except the indented > lines. And another filling everything except between a markup. Carlo, I'd love to see these. Maybe there are enough other emacs users on the list to justify sending them to the list; otherwise, just to me. I'd be happy to set up a tools area at pglaf.org. We made a little headway on this way back when, but for now just have links to gutcheck and one or two others. -- Greg From gbnewby at pglaf.org Sun Jan 16 14:25:43 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Jan 16 14:25:43 2005 Subject: [gutvol-d] Nightly feed of clearance information--Possible?
In-Reply-To: <200501161710.39963.donovan@abs.net> References: <200501161710.39963.donovan@abs.net> Message-ID: <20050116222543.GE9811@pglaf.org> On Sun, Jan 16, 2005 at 05:10:39PM -0500, D Garcia wrote: > It would be excellent to have a nightly-updated feed of appropriately > anonymized Cleared clearances published by PG in the same or similar format > as the Catalog feed. > > Granted, this is not the same as David Price's list, but it would be > beneficial to would-be content providers and take some of the load off of > him. This is doable, but what do you want the records for? > There is the drawback that the old 'gbn' clearances would not be in the new > database, but it should be possible if there is a willing volunteer able to > convert those appropriately to the new format, and I will mention that I have > appropriate experience in this area. Everything is in a database now...though the older data are not divided into appropriate fields. > I can forsee some difficulty in tying old clearances to current PG usernames, > though many should be mappable via email address. What do you need usernames for? Generally speaking, we try to keep clearance submitters' personal information "need to know," which many have requested. We're always happy to hook up people with old submitters, via email, when it looks like a cleared item will not really be produced any time soon. Just email me, or David Price, or copyright AT pglaf.org (goes to me & Juliet Sutherland). -- Greg From stephen.thomas at adelaide.edu.au Sun Jan 16 15:55:51 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Sun Jan 16 15:56:14 2005 Subject: [gutvol-d] Thanks!
(was "Looking for text editor which does ...") In-Reply-To: <935883953.20050116105501@noring.name> References: <616739250.20050115133712@noring.name> <9u2ju09te9n99jd5m2vajugj2k348fev69@4ax.com> <41EA716E.2030402@hutchinson.net> <935883953.20050116105501@noring.name> Message-ID: <41EAFF07.5070809@adelaide.edu.au> Jon Noring wrote: > ... > > Power Edit 2.1 also has a fairly nice HTML highlighting feature which > is quite flexible in tailoring (although a little buggy since I could > not edit the html.syn file to recognize and specially highlight > entities: "&...;" -- probably a bug in handling the "&" character.) If you are working on HTML files, how about HTML-Kit (a Windows editor with Tidy built in)? This does highlighting, will wrap lines to your line length, and also will validate your code, which is more important than line length in HTML. Won't work with plain text of course. ;-) Steve -- Stephen Thomas, Senior Systems Analyst, University of Adelaide Library UNIVERSITY OF ADELAIDE SA 5005 AUSTRALIA Phone: +61 8 830 35190 Fax: +61 8 830 34369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ From traverso at dm.unipi.it Mon Jan 17 01:27:36 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Mon Jan 17 01:23:31 2005 Subject: [gutvol-d] Re: Looking for text editor which does ...
In-Reply-To: <20050116222124.GD9811@pglaf.org> (message from Greg Newby on Sun, 16 Jan 2005 14:21:24 -0800) References: <616739250.20050115133712@noring.name> <200501161117.j0GBH5i12640@posso.dm.unipi.it> <20050116222124.GD9811@pglaf.org> Message-ID: <200501170927.j0H9RaD11523@posso.dm.unipi.it> >>>>> "Greg" == Greg Newby writes: Greg> On Sun, Jan 16, 2005 at 12:17:05PM +0100, Carlo Traverso Greg> wrote: >> I have an elisp macro that fills every paragraph, except the >> indented lines. And another filling everything except between a >> markup. Greg> Carlo, I'd love to see these. Maybe there are enough other Greg> emacs users on the list to justify sending it to the list - Greg> elsewise, just to me. Greg> I'd be happy to set up a tools area at pglaf.org. We made a Greg> little headway on this way back when, but for now just have Greg> links to gutcheck and one or two others. -- Greg An old version of my pre- and post-processing tools in emacs are in http://www.dm.unipi.it/~traverso/Ebooks/Lsp/dptools.el ; I am currently revising them (making them unicode-compatible), and better documented. From gbnewby at pglaf.org Mon Jan 17 10:45:10 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Jan 17 10:45:12 2005 Subject: [gutvol-d] Dead, down, broken, missing, empty Gutenberg mirrors In-Reply-To: References: Message-ID: <20050117184510.GA1770@pglaf.org> On Sat, Jan 08, 2005 at 10:48:04PM -0500, David A. Desrosiers wrote: > > > I just took a few minutes to check the mirror list (mostly > >because the ibiblio rsync mirror is now so slow, a 9600 modem would > >be faster at sending bits across). > > Incidentally, the Gutenberg rsync page[1] could use a minor > optimization/update to reflect shorter, more-concise rsync options. 
> > Currently, it states: > > rsync -rlHtSv --delete > > Those can be shortened to: > > # or HavS/SHav if its easier to remember > rsync -avHS --delete > > In my own case, I also use -z and --partial, and exclude many > of the files that aren't necessary to pull across for my mirror (the > DVD and enormous genome datafiles, for example). I've been using this for my mirrors, and it works well. I've updated the mirror-howto accordingly. Thanks! -- Greg From ag737 at freenet.carleton.ca Mon Jan 17 12:18:13 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Mon Jan 17 12:18:21 2005 Subject: [gutvol-d] Nightly feed of clearance information--Possible? Message-ID: <230177233bdb.233bdb230177@ncf.ca> This has been discussed at PGDP as well. I'm all for it... but I'd go for semi-anonymized: there should be a mechanism whereby people with similar interests can find one another, collaborate, and minimize duplication of effort, after bumping into each other in the virtual stacks. ----- Original Message ----- >From D Garcia Date Sun, 16 Jan 2005 17:10:39 -0500 To gutvol-d@lists.pglaf.org Subject [gutvol-d] Nightly feed of clearance information--Possible? It would be excellent to have a nightly-updated feed of appropriately anonymized Cleared clearances published by PG in the same or similar format as the Catalog feed. From donovan at abs.net Mon Jan 17 14:53:47 2005 From: donovan at abs.net (D Garcia) Date: Mon Jan 17 14:55:35 2005 Subject: [gutvol-d] Nightly feed of clearance information--Possible? In-Reply-To: <20050116222543.GE9811@pglaf.org> References: <200501161710.39963.donovan@abs.net> <20050116222543.GE9811@pglaf.org> Message-ID: <200501171753.47819.donovan@abs.net> On Sunday 16 January 2005 05:25 pm, Greg Newby wrote: > On Sun, Jan 16, 2005 at 05:10:39PM -0500, D Garcia wrote: > > It would be excellent to have a nightly-updated feed of appropriately > > anonymized Cleared clearances published by PG in the same or similar > > format as the Catalog feed. 
> > This is doable, but what do you want the records for? The same purpose as David's list, to look up what's In Progress (i.e. "Cleared") so as to avoid duplication of effort. The advantage of PG doing it automatically is *currency*. I know David does the best he can, but it's a huge amount of work and is frequently a month behind. I know from Juliet that recently there have been about 100 clearances being done per day, though that's probably more than usual. Still though, at say 50 a day for a month ... that's a significant lag time where different volunteers could each be getting clearances for the same works. It's happened to me twice recently, and several times before. > Everything is in a database now...though the older data are not > divided into appropriate fields. That's good to know, and simplifies what I'm talking about below: > > I can forsee some difficulty in tying old clearances to current PG > > usernames, though many should be mappable via email address. > > What do you need usernames for? Generally speaking, we try to keep > clearance submitters' personal information "need to know," which many have > requested. *I* don't need to know them. But anyone transforming the old gbn clearances into the new style would need the email on that one to try to match it to the current PG username. Simply an observation from a data manipulation standpoint, and strictly backend. Hope that clarifies it for you! David From gbnewby at pglaf.org Mon Jan 17 16:39:39 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Jan 17 16:39:41 2005 Subject: [gutvol-d] Nightly feed of clearance information--Possible? 
In-Reply-To: <200501171753.47819.donovan@abs.net> References: <200501161710.39963.donovan@abs.net> <20050116222543.GE9811@pglaf.org> <200501171753.47819.donovan@abs.net> Message-ID: <20050118003939.GB10232@pglaf.org> On Mon, Jan 17, 2005 at 05:53:47PM -0500, D Garcia wrote: > On Sunday 16 January 2005 05:25 pm, Greg Newby wrote: > > On Sun, Jan 16, 2005 at 05:10:39PM -0500, D Garcia wrote: > > > It would be excellent to have a nightly-updated feed of appropriately > > > anonymized Cleared clearances published by PG in the same or similar > > > format as the Catalog feed. > > > > This is doable, but what do you want the records for? > > The same purpose as David's list, to look up what's In Progress (i.e. > "Cleared") so as to avoid duplication of effort. The advantage of PG doing it > automatically is *currency*. I know David does the best he can, but it's a > huge amount of work and is frequently a month behind. I know from Juliet that > recently there have been about 100 clearances being done per day, though > that's probably more than usual. Still though, at say 50 a day for a > month ... that's a significant lag time where different volunteers could each > be getting clearances for the same works. It's happened to me twice recently, > and several times before. Is this style enough, from two recent clearances: OK 20050104154143pergaud Le roman de Miraut, chien de chasse Louis Pergaud 1913:c OK 20050104152050malot En famille Hector Malot 1895:c OK, based on library stamp.--Juliet That's tab-delimited.... Something in XML or with field labels is also easy, though what's above is straight out of the log file the whitewashers use. -- Greg > > Everything is in a database now...though the older data are not > > divided into appropriate fields. > > That's good to know, and simplifies what I'm talking about below: > > > > I can forsee some difficulty in tying old clearances to current PG > > > usernames, though many should be mappable via email address. 
> > > > What do you need usernames for? Generally speaking, we try to keep > > clearance submitters' personal information "need to know," which many have > > requested. > > *I* don't need to know them. But anyone transforming the old gbn clearances > into the new style would need the email on that one to try to match it to the > current PG username. Simply an observation from a data manipulation > standpoint, and strictly backend. > > Hope that clarifies it for you! > David > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From nwolcott at dsdial.net Tue Jan 18 07:41:10 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Tue Jan 18 14:12:06 2005 Subject: [gutvol-d] Looking for text editor which does ... References: <616739250.20050115133712@noring.name> Message-ID: <000201c4fdaa$b37b81e0$759495ce@gw98> If you want just a text editor and not the whole GUIGUTS, you might try TEXTPAD. It does much of what you want--will reformat marked text to the current window (ok so you have to set it right to begin with), displays CR's as an option, will also join marked lines. It will not remove extra spaces between words, but will I think not hyphenate words. Has macro capability etc. Has been around since 1995. http://www.textpad.com/ .Also does a lot of other neat things convert from DOS , MAC etc. As far as regularizing, if plain text just save text document as html in Word. Extra spaces will be removed. Select entire text and paste back into a text document. Of course this does not do intelligent things like remove spaces before a period, etc. ----- Original Message ----- From: "Jon Noring" To: Sent: Saturday, January 15, 2005 3:37 PM Subject: [gutvol-d] Looking for text editor which does ... > Everyone, > > A basic question... 
> > I'd like to get the recommendations of the long-timers here for a > Windows-based GUI text editor or utility which cleans up *selected* > paragraphs of text (in plain text documents) to create uniform line > lengths with hard line breaks. > > The situation is that I have a large marked-up text document where > many paragraphs have varying and (many times) quite long line lengths. > For example, a paragraph may consist of three lines, the first may be > 250 characters long, the second 50 characters long, and the third 120 > characters long -- and I'd like to "regularize" the paragraph with > lines exactly 70 characters or less in length (this paragraph is an > example of such "regularization".) I'd like to simply select those > three lines in the utility, click a button or something, and the text > is automagically "regularized" (no hyphenation, one space between > words, etc.) > > It gets quite laborious doing this by hand with my text editor of > choice, vi (I use Lemmy, a Windows vi-clone, for most of my text > editing needs.) I do NOT want a tool which only globally does this to > the whole document (i.e. there are longer lines in the document which > I wish to keep unbroken.) And I do NOT want a tool requiring typing in > a long command line -- by the time I do that I could regularize the > paragraph by hand in my editor. I just want to select and regularize. > > > So, what's out there? Obviously Project Gutenbergers must use various > tools to "regularize" paragraphs. (There's no doubt a different word > most everyone here uses to describe this process, but I don't know > what it is, thus the use of the word "regularize".) > > Thanks. > > Jon Noring > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From nwolcott at dsdial.net Tue Jan 18 07:48:24 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Tue Jan 18 14:12:09 2005 Subject: [gutvol-d] DP crisis? 
Message-ID: <000301c4fdaa$b47ee820$759495ce@gw98>

Does DP have a post-processing crisis? With thousands of volunteers,
texts flow quickly and regularly through OCR and the first phase.
However, there are several thousand books that have been in
post-processing for over a year. Many of these are hard, but many are
plain text.

Is it appropriate to re-scan a book to start the process over again,
hoping for better luck? One could clear another edition, etc.

N Wolcott nwolcott2@post.harvard.edu

From shalesller at writeme.com Tue Jan 18 14:14:16 2005
From: shalesller at writeme.com (D. Starner)
Date: Tue Jan 18 14:14:59 2005
Subject: [gutvol-d] Nightly feed of clearance information--Possible?
Message-ID: <20050118221416.67D1F4BE6D@ws1-1.us4.outblaze.com>

"Greg Newby" writes:

> Is this style enough, from two recent clearances:
>
> OK 20050104154143pergaud Le roman de Miraut, chien de chasse Louis Pergaud 1913:c
>
> OK 20050104152050malot En famille Hector Malot 1895:c OK, based on library
> stamp.--Juliet

As one of the people who wants this, that would be fine. It'd be nice to
have all the authors' names, though, especially translators.

From donovan at abs.net Tue Jan 18 14:58:47 2005
From: donovan at abs.net (D Garcia)
Date: Tue Jan 18 14:59:39 2005
Subject: [gutvol-d] Nightly feed of clearance information--Possible?
In-Reply-To: <20050118003939.GB10232@pglaf.org>
References: <200501161710.39963.donovan@abs.net>
 <200501171753.47819.donovan@abs.net>
 <20050118003939.GB10232@pglaf.org>
Message-ID: <200501181758.48184.donovan@abs.net>

On Monday 17 January 2005 07:39 pm, Greg Newby wrote:

> Is this style enough, from two recent clearances:
>
> OK 20050104154143pergaud Le roman de Miraut, chien de chasse
> Louis Pergaud 1913:c
>
> OK 20050104152050malot En famille Hector Malot 1895:c OK,
> based on library stamp.--Juliet
>
> That's tab-delimited.... Something in XML or with field
> labels is also easy, though what's above is straight out of
> the log file the whitewashers use.
> -- Greg

Works for me (though I don't think the comments are particularly
relevant). The clearance date is obviously parseable, ditto title and
author.

When can you have it world-accessible? :)

David

From sly at victoria.tc.ca Tue Jan 18 15:10:18 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Tue Jan 18 15:10:29 2005
Subject: [gutvol-d] DP crisis?
In-Reply-To: <000301c4fdaa$b47ee820$759495ce@gw98>
References: <000301c4fdaa$b47ee820$759495ce@gw98>
Message-ID:

On Tue, 18 Jan 2005, N Wolcott wrote:

> Does DP have a post-processing crisis? With thousands of volunteers
> texts flow regularly through the OCR and first phase quickly. However
> there are several thousand books that have been in post processing
> over a year. Many of these are hard, but many are plain text.
>
> Is it appropriate to re-scan a book to start the process over again
> hoping for better luck?
> One could clear another edition, etc.
>

Perhaps "crisis" is too strong a word to use. I suspect this situation
is somewhat inevitable: playing a part in the proofing process, doing
one page at a time, is relatively easy to do and gives a sense of
accomplishment sooner. Post-proofing is a larger commitment, and can be
more tedious.

On another note, one of the many texts waiting in the queue is
"Alcyone", a collection of poetry by Archibald Lampman, a
highly-regarded Canadian poet. I have all the text of this volume,
gathered from another online source, which I could use for comparison.
I have offered multiple times since June 2004 to do post-proofing on
it, but have not had any responses other than "contact the person the
text has been assigned to", which I have tried multiple times with no
response.

Andrew

From joshua at hutchinson.net Tue Jan 18 15:14:08 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Jan 18 15:14:22 2005
Subject: [gutvol-d] DP crisis?
In-Reply-To: <000301c4fdaa$b47ee820$759495ce@gw98>
References: <000301c4fdaa$b47ee820$759495ce@gw98>
Message-ID: <41ED9840.6070202@hutchinson.net>

That is a bit of an exaggeration, but there are many, many texts in the
post-processing stage at DP. Rescanning the book would only make it
worse. Mostly we need people who are willing to work on post-processing
texts.

Long-term, we are actively working on new ways to handle much of the
post-processing work. Currently, all the post-processing of a text is
done by one person. If things work the way we hope, much of the
post-processing work will become distributed, too.

Josh

N Wolcott wrote:

> Does DP have a post-processing crisis? With thousands of volunteers
> texts flow regularly through the OCR and first phase quickly. However
> there are several thousand books that have been in post processing
> over a year. Many of these are hard, but many are plain text.
>
> Is it appropriate to re-scan a book to start the process over again
> hoping for better luck?
> One could clear another edition, etc.
> > N Wolcott nwolcott2@post.harvard.edu > >------------------------------------------------------------------------ > >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d > > From maitriv at yahoo.com Wed Jan 19 07:25:39 2005 From: maitriv at yahoo.com (maitri venkat-ramani) Date: Wed Jan 19 07:25:44 2005 Subject: [gutvol-d] Slashdot: eBooks In Germany In-Reply-To: <41ED9840.6070202@hutchinson.net> Message-ID: <20050119152540.7231.qmail@web52305.mail.yahoo.com> German Library Allowed To Crack Copy Protection Posted by timothy on Wednesday January 19, @04:03AM from the clashing-aims dept. AlexanderT writes "The EU Directive 2001/29/EU (also known as the European Copyright Directive) has made it "a criminal offence to break or attempt to break the copy protection or access control systems on digital content such as music, videos, eBooks, and software". Since today, at least in Germany there is one notable exception: The Deutsche Bibliothek, Germany's national library and bibliographic information center, has received a "license to copy", i.e. the official authorization to crack and duplicate DRM-protected e-books and other digital media such as CD-Audio and CD-Roms. The Deutsche Bibliothek achieved an agreement with the German Federation of the Phonographic Industry and the German Booksellers and Publishers Association after it became obvious that copy protections would not only annoy teenage school boys, but also prohibit the library from fulling its legal mandate to collect, process and bibliographic index important German and German-language based works." __________________________________ Do you Yahoo!? Yahoo! Mail - You care about security. So do we. http://promotions.yahoo.com/new_mail From bkeir at pgdp.net Wed Jan 19 12:58:58 2005 From: bkeir at pgdp.net (bkeir@pgdp.net) Date: Wed Jan 19 12:59:34 2005 Subject: [gutvol-d] DP crisis? 
In-Reply-To: <000301c4fdaa$b47ee820$759495ce@gw98> References: <000301c4fdaa$b47ee820$759495ce@gw98> Message-ID: <3121.61.8.63.4.1106168338.squirrel@61.8.63.4> > Is it appropriate to re-scan a book to start the process over again hoping > for better luck? Absolutely not. This does not help the situation in any way, and in fact contributes further to the perceived logjam. Every book currently in the PPing phase at DP *will* one day be posted to PG. Ebooks don't have a shelf-life, they will not go stale, there is no race. Given a book written, say, 90 (or 190) years ago, and especially given that PG aims to keep the ebook version available, when finished, for many hundreds of years, a book taking a year or two (or, yes, even five (though I don't know of any needing so long, yet)) to be digitised is chickenfeed. Cheers Bill From bruce at zuhause.org Wed Jan 19 23:55:40 2005 From: bruce at zuhause.org (Bruce Albrecht) Date: Wed Jan 19 23:56:00 2005 Subject: [gutvol-d] DP crisis? In-Reply-To: <000301c4fdaa$b47ee820$759495ce@gw98> References: <000301c4fdaa$b47ee820$759495ce@gw98> Message-ID: <16879.25596.360616.355515@celery.zuhause.org> N Wolcott writes: > Does DP have a post-processing crisis? With thousands of > volunteers texts flow regularly through the OCR and first phase > quickly. However there are several thousand books that have been in > post processing over a year. Many of these are hard, but many are > plain text. I know that other people have commented on this, but I'd just like to state that from looking at the PGDP statistics, I don't believe that this is even close to true. There are about 2250 projects that have completed the first two rounds of proofing but not posted to PG, 400 are waiting for a PPer, 1600 are in post-processing, and 250 are waiting for verification. 
None of the 400 projects waiting for a post-processor have been in the
queue for more than a year, although it is possible that some have been
checked out and returned to the queue one or more times and could be
older than a year. Of the 1600 that are checked out for post-processing,
only about 40 have been in the queue more than a year, and about half
have been checked out to the current PPer for 60 days or less. Again, it
is possible that several PPers have checked out particular projects, so
that the statistics make them appear newer than they actually are.

Finally, I'd like to point out that most PGDP projects are now
generating more than one version of the text, HTML and plain text, and
some of the delays can be due to PPers waiting to get better copies of
images. Two of my four post-processing projects had images that were
adequate for a text-only project but inadequate for HTML, so I had to
go back to the content provider for better images, and I need to do
some image processing before I am done with the HTML edition.

> Is it appropriate to re-scan a book to start the process over again hoping for better luck?
> One could clear another edition, etc.

It seems as though you have some specific projects in mind which have
not made it through the DP post-processing process. If they are waiting
for a PPer, volunteer to do it yourself. If the project(s) have been
languishing in the PP queues, try to contact the PPer directly, and if
that fails, try to contact one of the PGDP powers that be, to see if you
can get the project reassigned to you. The powers that be at PGDP do try
to get PPers to complete the project within 90 days, or to give it up if
they're not actively working on it. In any case, unless the other
edition differs from the one that's in the queue, you're better off
trying to work within the system before trying to restart the process
from scratch.

From hart at pglaf.org Thu Jan 20 08:47:26 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 20 08:47:28 2005
Subject: [gutvol-d] Moving and Removing eBooks
Message-ID:

The Project Gutenberg Philosophy Concerning Orwellian History Rewriting

While Encyclopedia Britannica has made it obvious on multiple occasions
that they are embarrassed by references to their 11th edition, which is
widely regarded as one of the seminal marks in reference materials, this
attitude is not shared by others who would not approve of the Orwellian
rewriting of history to make it appear as if Britannica had always been
in possession of the facts it presents today and had never written from
points of view that have now become politically incorrect, or even
discredited in more recent times.

However, there are at least a dozen or two very outspoken volunteers at
Project Gutenberg among a dozen or two thousand of such volunteers, who
would prefer to delete many of the original Project Gutenberg eBooks in
favor of replacing them with something else, as opposed to just working
on them to bring them up to the standards of the modern era of eBooks.

Shakespeare Compared To Britannica

Shakespeare Donated by The World Library, The Earliest CD of eBooks

11 years ago Project Gutenberg received a donation of a "Complete Works
of William Shakespeare," in their Folio format, which Project Gutenberg
was then allowed to work with in plain text format to create their #100
eBook. . .a milestone of the day in which the only comparable eBook was
#10, which contained both the Old and New Testaments of the Bible.

The project of converting the World Library files took months, and on
the last night a dozen volunteers burned the midnight oil in various
time zones, until, at last, just as we were running out of time zones,
we completed Shakespeare's Complete Works, version 1.0, December 10,
1993. Official date was listed as January, 1994, as we were a bit ahead
of schedule.
Today Project Gutenberg has created about 150 times as many eBooks now, as we had then, though certainly only a few of them could rival eBooks containing the complete works of Shakespeare. Actually, several eBooks of various editions of Shakespeare have been added since then with each one having those who think it is the best of the bunch, not to mention, or only quietly, that we also included the Shakespearian apocrypha, and the Biblical apocrypha as well. While there really aren't any single volumes, no matter how large, that would increase today's eLibraries by the same comparable amount as that edition of Shakespeare did 11 years ago, something such as Britannica's 11th edition would be about as comparable as possible. Now. . .the question: Would someone be willing to do all the work to donate a Britannica 11th to Project Gutenberg this year if they thought it would be removed from Project Gutenberg a decade after it was first included? From hart at pglaf.org Thu Jan 20 08:48:03 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 20 08:48:04 2005 Subject: [gutvol-d] PT1 History of PG Message-ID: THE HISTORY OF PROJECT GUTENBERG It seems I have been remiss in keeping everyone up to date on the history of Project Gutenberg over the past few years, so I am taking this opportunity to write A Brief History of Project Gutenberg in several segments that will, hopefully, make amends for this lack on my part. PART ONE: THE FIRST 10 YEARS 1971-1980 In terms of actual page production, many people dismiss the opening decade of Project Gutenberg nearly completely. Now in terms of space, the published eTexts, as we called those at the time, will all fit on a modern floppy disk. Because of stringent storage allocations our eText files on the million dollar mainframe were just barely allowed. The struggle to put even these small files online was enormous, as it was a totally revolutionary idea to put up a file for a non-predetermined time period. 
This idea of something an entire future could download had never been
brought up, and thus it was VERY hard to get permission to post even a
file as small as the Declaration of Independence, because it was going
to take up permanent space on the computer.

Files of the following list were perhaps the first inkling of a kind of
permanence the early Internet pioneers did not consider:

Dec 1979  Abraham Lincoln's First Inaugural Address
Dec 1978  Abraham Lincoln's Second Inaugural Address
Dec 1977  The Mayflower Compact
Dec 1976  Give Me Liberty Or Give Me Death, Patrick Henry
Dec 1975  The United States Constitution
Nov 1974  Gettysburg Address, Abraham Lincoln
Nov 1973  John F. Kennedy's Inaugural Address
Dec 1972  The United States Bill of Rights
Dec 1971  The United States Declaration of Independence

These first 9 files were collected into all7011.txt and all7011.zip for
easy redistribution in upper and lower case in later years. The
original files were all upper case, as there was no lower case on the
early machines we were using at the time.

You may see that we skipped two years between the US Bill of Rights and
the US Constitution; we were originally going to try to include the
complete Constitution just a year after the Declaration, but we were
told that would take too much space, and we were given just enough
space for the Bill of Rights. The next year we asked again, but room
was still very scarce, and so we asked again the next year and the year
after. By then I was able to make a convincing argument that waiting
any longer might delay it so long that people wouldn't have access to
it long enough before the Bicentennial year of 1976, so we finally got
room at the end of 1975.

This may not sound very exciting to you from 30 years later, but it was
VERY exciting to us, being able to put these files online for a whole
country to use during the United States Bicentennial.

We finished out the decade with more of those "Freedom Celebration"
documents, as they were called, which were placed on the walls of a
variety of schools, malls, etc., during this period.

During the period the greatest struggle was just to talk operators, even
those that were very good friends, into giving us enough space to store
anything but the smallest files. It was one thing to have $100,000,000
in "computer money" that could be used to run programs and send emails,
but it was quite another thing to be granted space to store files that
people from around the country would download.

Here's just one early example:

When I completed the Declaration of Independence, I wanted to email it
to everyone on the Net [DARPANet, as we called it], but I found, to my
great surprise, that if I had done this, even with such small files as
the Declaration of Independence [5K], it would have created a complete
network crash, since most of our wires were 113 baud, or 11 characters
per second.

Luckily, I asked for help in sending it, and avoided becoming quite well
known as the first person to bring the Net to its knees; and a "Morris
Worm" would have only been an asterisk, and so would I!!!

In the end we simply posted a message to what later became comp.gen so
people could get the file on request. My recollection is that 6 people
downloaded it, other than the other four on our site, so the greatest
penetration would have been about 10%. . .which sounds big by today's
standards, but I had been hoping for more.

A word about the computer operators of the day: we used to joke in many
ways that the computer operators were the current priesthood-- you
handed in your offering through the stainless steel window, and prayed
that they would be worthy enough for the computer to run it. If you
understand this, then perhaps you can also understand how it was the
computer operators had so much power.
Not only should they be considered as the entire force of computer security of that day, but they could also save you hours, if not days, of time by telling you just where and/or why your program wasn't running. I was quite seriously lucky that my brother's best friend was the operator from midnight to 8AM, when most of the free computer time was available, and that he gave me the account I used to start Project Gutenberg-- and as lucky that MY best friend became the 8AM-5PM operator. I should add that even at such an early date, I had help from those anonymous contributors who so often help. In this case I never was able to find out who typed in the first U.S. Constitution versions. I asked and asked, and even though there weren't that many persons, I never could find or thank the one who did it. That version was a print version in what served as a sort of markup of the day, so all I had to do was take out all the markup, backspace/underscores etc. to create a version that looked good onscreen. If anyone knows, it would still be nice to find out today, and send our thanks; for now I would just like to include a general thanks to all the volunteers who have helped Project Gutenberg over more than 1/3 century. Anyway, that's the story of the first decade of Project Gutenberg-- and I hope to work up something for the 1980's for next week. I should add here that even though the Apple II was out, I had none of the kind of money it would have taken to buy one, so my computer ownership starts not in this segment, but in the next one. Michael S. 
Hart
Founder
Project Gutenberg

Postscript:

For those interested in counting ye olde Project Gutenberg eBooks--
please note that there is no growth curve for this period; a growth
graph would simply be a straight line, 1 title = 1 year, so it is a
trivial point to say that at this growth rate it would take ~15,000
years to do ~15,000 titles, and that I would have been dead so long
before we ever got to eBook #100 that no one would have remembered.

Hence we do not talk about doubling rates for this period since the
years required doubled at the same rate as the index entries did.

Nevertheless, you will, from time to time, see people manipulate an
army of statistics in such a way as to include these in patterns of
growth, even though it is common knowledge that the earliest growth
figures of any such pattern are quite linear. Just look at a curve of
the population of the earth for a perfect example. Such curves, if
studied in detail, yield a wealth of such growth information.

Sample Moore's Law Projections Based on 1971

Here is an example of what would happen if Project Gutenberg growth
projections were started using the 1 item we had in 1971:

        Start  Finish  Total  Total
#####   Year   Year    Years  Doubles  x2^y   = Grand Total in Year

#1      1971   2001    30     20       1*2^20 = 1,048,576 in 2001
#1      1971   2004    33     22       1*2^22 = 4,194,304 in 2004

Obviously no one ever seriously considered that Project Gutenberg might
actually release a million eBooks in 2001, but there were a few
examples recently of suggestions that we should have used the 1971
date, and thus the resultant figures listed above when doing our
Moore's Law predictions.

I trust at least this specific example has now been put to rest.
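
[Editorial sketch, not part of the original message.] The projection
table above can be reproduced in a few lines. This is only a sketch
under stated assumptions: the 18-month doubling period is inferred from
the table itself (30 years giving 20 doublings) and is not stated in
the message, and the function names and parameters are hypothetical,
chosen purely for illustration.

```python
def moores_law_projection(start_year, end_year, start_count=1, doubling_months=18):
    """Project a count forward, doubling once every `doubling_months`.

    The 18-month default is an assumption inferred from the table
    (30 years -> 20 doublings); the message does not state it.
    Returns (elapsed_years, whole_doublings, projected_count).
    """
    years = end_year - start_year
    doublings = (years * 12) // doubling_months  # count whole doublings only
    return years, doublings, start_count * 2 ** doublings


def linear_years_needed(target_titles, titles_per_year=1):
    """Years needed at the flat 1971-1980 pace of one title per year."""
    return target_titles // titles_per_year


if __name__ == "__main__":
    for end_year in (2001, 2004):
        years, doublings, total = moores_law_projection(1971, end_year)
        print(f"1971-{end_year}: {years} years, {doublings} doublings, "
              f"1*2^{doublings} = {total:,}")
    print(f"Linear pace: {linear_years_needed(15_000):,} years for 15,000 titles")
```

Running it gives the same two grand totals as the table, alongside the
15,000-year figure for the one-title-per-year pace, which makes the gap
between the linear first decade and a doubling projection easy to see.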

From jon at noring.name Thu Jan 20 10:34:58 2005
From: jon at noring.name (Jon Noring)
Date: Thu Jan 20 10:35:43 2005
Subject: [gutvol-d] Moving and Removing eBooks
In-Reply-To:
References:
Message-ID: <14211702703.20050120113458@noring.name>

Michael Hart wrote:

> The Project Gutenberg Philosophy Concerning Orwellian History Rewriting
>
> However, there are at least a dozen or two very outspoken volunteers at
> Project Gutenberg among a dozen or two thousand of such volunteers, who
> would prefer to delete many of the original Project Gutenberg eBooks in
> favor of replacing them with something else, as opposed to just working
> on them to bring them up to the standards of the modern era of eBooks.

Who says the original PG texts (many of which *need* to be redone from
scratch [note]) will disappear? Don't you keep the prior versions of the
same Work in the archive?

[Note: many of the pre-DP texts need to be redone from scratch for
various reasons. For another project, I'm now working on My Ántonia by
Willa Cather, one of the early PG releases (#242), and the latest PG
edition of it, #11!, is horribly mangled from various edits made,
without recourse to the original, during its lifetime. (In addition,
the PG version apparently used the very buggy English release as one
source -- gak!) This emendment process over many editions without
recourse to the original is like the party trick of sharing a bit of
information in a chain from person to person; by the tenth person the
meaning of the information has so changed that it no longer conforms to
the original!

I recently bought the original 1918 edition (fourth printing I believe)
of My Ántonia and am now scanning it. But in the meanwhile I'm mostly
done producing an entirely *faithful* (content-wise) draft XHTML version
of this book, faithful to the content of the 1st Edition in every
detail (no doubt a few small errors persist, but I know they are few
and far between.)
I will gladly donate the finished XHTML 1.1 version to PG if the associated page scans, which are linked from the XHTML, will be included in the archive, and the full source citation is kept *intact* in its entirety in the marked-up text and in the boilerplate metadata. For those interested, a temporarily and awful-CSS-styled version of the draft can be seen at: http://www.openreader.org/myantonia/myantonia.html (includes page scan links) http://www.openreader.org/myantonia/myantonia-np.html (sans page scan links) (Only the first several page scans are available online at this time as low-rez JPGs -- the originals are full-color 600 dpi (optical). Critical feedback on the underlying markup is more than welcome. If the XHTML+scans won't fit into the work flow of DP, which I don't believe they will, I'll soon ask for volunteers to finalize the XHTML version by comparing it to the page scans which will all be placed online and linked from the text, and to email me any found errors for final fixing -- a sort of DP-like process since it can be done page-by-page. Any volunteers?) > Now. . .the question: > > Would someone be willing to do all the work to donate a Britannica 11th > to Project Gutenberg this year if they thought it would be removed from > Project Gutenberg a decade after it was first included? (Again, why *remove* what has already been submitted?) Michael, I considered over a decade ago in actively volunteering for PG but decided against it because PG was not focusing on doing things *right*, IMHO. For starters, PG was amiss in: 1) Not including full source information in the texts. 2) Not making faithful reproductions of the sources -- too much leeway was given to emendments and to merging different editions, at least without a vetting process to assure there were no bad emendments or surreptitious changes (now that's rewriting history!) 
As it stands now, I have no faith that the early texts are faithful reproductions of the original print versions or that some have been surreptitiously changed -- and no tracking of the emendments were ever recorded -- that's why I rarely use the PG texts, other than DP releases which I have a lot more faith in. (I also have the Frankenstein "monster" debacle which I've shared here in the past.) 3) Converting non-ASCII characters to ASCII equivalents (e.g., removing accents from characters.) Proper reproduction of the original characters used is *critical* to preserve. Any PG text which "ASCII-ized" all characters is automatically broken and must be replaced with a remake from, or by reference to, an original source copy. (Today I'd add a fourth requirement: retain page scans for all new works, and no longer accept works which don't have page scans to go along with the texts to 1) verify authenticity, 2) to provide guidance for those who plan to use the texts, such as for presentational purposes, and 3) to help properly fix any claimed errors. Internet Archive will gladly archive the page scans if PG's servers don't have the space and bandwidth to handle the page scans.) I'm not alone in this sentiment, Michael. I talk to others who did *not* volunteer for PG because of the clearly wrong policies which PG early-on established (and "not establishing policies" is a defacto policy.) One must not only count the volunteers, one must also count the non-volunteers who considered volunteering. To answer your question, any book would not be replaced *if* it were processed *right* in the first place. We know enough today as to what is necessary to properly make digital text versions of books, and by and large DP is following best practice. Consider the early years to be experimental. (Most engineers will tell you that the first and even second versions of anything are "learning" -- you learn from them, and then throw them away. 
Stable design is not usually reached until at least the third version of anything.) (You also ask how people would feel if their work would be "thrown away" after a decade. Well, how do people feel when their work is mangled in subsequent PG editions by new emendments of others, such as what appeared to happen to My Ántonia, which, as I noted above, is so terribly mangled that it must be replaced?) Many people have enjoyed the texts which PG has produced, buggy as many of the early ones are, so it's not as if the early work PG produced was wasted. It was not. Just like anything in the world, there is a life-span to the texts. One doesn't look back but one looks ahead to the future. I see redoing the early corpus of PG texts to be a great opportunity, and not something to be avoided. Properly done, this redo project will produce texts which should have a very long shelf life, if not indefinite. DP should take the lead in this effort to redo the early PG classics, since these are the most popular books in the PG corpus. Jon From Gutenberg9443 at aol.com Thu Jan 20 11:32:32 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Thu Jan 20 11:32:44 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs. anything else Message-ID: <1a5.2f2197b2.2f216150@aol.com> I just got an extremely distressed snailmail letter from a retired professor of Oriental and Buddhist studies, who is blind and lives in Thailand. I quote him: "Now as far as I can tell there are no more e-books available in "txt" format but only in "HTM" format. I am blind as I stated at the outset. I use an Apple computer with a screen reading programme called "OutSpoken" which conflicts with many document formats. It also conflicts with many common programmes, but that's another story which has nothing to do with you.
"My problem with the Project Gutenberg Archive in its present form is that when I download a book, the file on my computer includes many rubbish characters which make it virtually impossible to read." <> "So in conclusion I thank you for services rendered me in the past and regret that they are no longer available to me because of the remarkable technological advances that you are incorporating into your archive. Yours, frustrated and regretful, Dr. Peter Della SAntina." I have told him that most books still are in .txt as well as other things, and that when they aren't, the problem usually is that they are prohibitively long and/or contain characters which we cannot use to post in .txt. I then told him that whenever he runs into this problem, he is to email me and I personally will send him a copy of the book in .txt; if I don't get it to him within 3 days, he is to assume I'm ill and send the request to Aaron Cannon. I know what he's talking about because I downloaded a copy of Kipling's story "The Brushwood Boy" a couple of weeks ago in .htm form and found a lot of rubbish characters in it, but I just read around them despite feeling rather exasperated. I had been seized with an acute wish to read the story at three AM, and after getting up, booting my computer etc., downloading it, putting it on my ebook reader, shutting everything down again, and going back to bed, my desire to redownload it at that time was nonexistent. Anyway, that could not possibly have been a recent post, though for all I know it might be a recent REpost. Most .htm files work just fine on that ebook reader. I told Dr. Santina that I appreciated his letting me know, and asked him to notify us if he ran into such problems in the future. I think this is a strong argument for continuing Michael's practice of posting everything in English in .txt AND whatever else rather than in whatever else by itself. Anne -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050120/dfb6c704/attachment-0001.html From marcello at perathoner.de Thu Jan 20 10:42:43 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Jan 20 12:27:37 2005 Subject: [gutvol-d] PT1 History of PG In-Reply-To: References: Message-ID: <41EFFBA3.8040503@perathoner.de> Michael Hart wrote: > Obviously no one ever seriously considered that Project Gutenberg > might actually release a million eBooks in 2001, but there were a > few examples recently of suggestions that we should have used the > 1971 date, and thus the resultant figures listed above when doing > our Moore's Law predictions. [epigraph] "Contrary to popular claims, it appears that the common versions of Moore's Law have not been valid during the last decades. As semiconductors are becoming important in economy and society, Moore's Law is now becoming an increasingly misleading predictor of future developments." ... "Indeed, sociologically Moore's Law is a fascinating case of how myths are manufactured in the modern society and how such myths rapidly propagate into scientific articles, speeches of leading industrialists, and government policy reports around the world." http://firstmonday.org/issues/issue7_11/tuomi/index.html [/epigraph] Read that page for the sad truth about Moore's "Law". My suggestion was to stop using arbitrary data to keep up the illusion of Moore's Law (which, if you had read that page, you would have known never worked even for computers) but to use real data to show that Moore's "Law" does not fit PG production. The suggestion was to use the real date the project started (1971) instead of your fictitious and arbitrary one (1990). Of course, using real dates, the idea that PG production followed Moore's Law dies a horrible death. Even looking at the relatively short period of Nov 2003 to Nov 2004 we can prove in a very simple manner that Moore's Law doesn't hold. 1. In Nov 2003 we had 10000 books. 2.
Applying Moore's Law, in Nov 2004 we should have had 10000 * 2 ^ (12/18) = 15874 books. 3. Moore's Law does not hold for PG archive size. QED The sad fact is: some people with a marketing person's mind prefer to stick to a phony and proven wrong "Law" because it is such a slick formulation. Why do we need flashy formulas at all? If we say: "we got 15000 books today" isn't that enough? Don't you have faith in the facts? -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Thu Jan 20 11:47:09 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Thu Jan 20 12:27:46 2005 Subject: [gutvol-d] Michael and the 12 Thousand Apostles In-Reply-To: References: Message-ID: <41F00ABD.1050804@perathoner.de> Michael Hart wrote: > However, there are at least a dozen or two very outspoken volunteers at > Project Gutenberg among a dozen or two thousand of such volunteers, who > would prefer to delete many of the original Project Gutenberg eBooks in > favor of replacing them with something else, as opposed to just working > on them to bring them up to the standards of the modern era of eBooks. This is a deliberate mis-statement of the facts. Glossary: "A dozen or two very outspoken volunteers": those who spoke up against Michael. "among 12 thousand": the rhetoric of the silent majorities is an instrument widely used in propaganda. The speaker stipulates the existence of a fictitious silent majority who are in favour of his ideas. This didn't work when used against the peace movement in the 80s and doesn't work now. The facts set right: "A dozen or two very outspoken volunteers" were contemplating the question if it was advisable to keep some files in the catalog database which cannot be read any more because the file format is proprietary and we could not get a copy of the reader program to distribute with the files. 
The question came up because a reader mistakenly downloaded those files for genuine ones and was asking us for the reader program, which of course we couldn't supply. Nobody was advocating to delete the files. Some people advocated writing a "proprietary file formats hall of shame" page using appropriate language and linking to the files from there as examples. This would have made the files more visible than they are now. Makes you wonder: which one of these proposals made Michael use the phrase "Orwellian Rewriting of History"? > Would someone be willing to do all the work to donate a Britannica 11th > to Project Gutenberg this year if they thought it would be removed from > Project Gutenberg a decade after it was first included? This rhetoric question is based on mis-stated facts. Simple answer: If somebody was to do the Britannica now, she would simply include a plain text version -- which, we know, will last forever. -- Marcello Perathoner webmaster@gutenberg.org From jmdyck at ibiblio.org Thu Jan 20 12:27:47 2005 From: jmdyck at ibiblio.org (Michael Dyck) Date: Thu Jan 20 12:29:39 2005 Subject: [gutvol-d] Moving and Removing eBooks References: Message-ID: <41F01443.8A0BAEAD@ibiblio.org> Michael Hart wrote: > > However, there are at least a dozen or two very outspoken volunteers at > Project Gutenberg among a dozen or two thousand of such volunteers, who > would prefer to delete many of the original Project Gutenberg eBooks in > favor of replacing them with something else, as opposed to just working > on them to bring them up to the standards of the modern era of eBooks. This sounds like an exaggeration to me. It's true that (on January 4th) D. Starner asked[1] if we could "get rid of" PG's World Library Editions of Shakespeare, and appeared to be in favour of doing so. See: http://lists.pglaf.org/private.cgi/gutvol-d/2005-January/001133.html However, (a) I don't see that anyone agreed with the deletion. 
So that's one outspoken volunteer, not "at least a dozen or two". (b) I don't see anyone recommending the deletion of any other books. If there are posts I've missed that would support the assertion above, feel free to give links to the archive. Mind you, that's assuming that the outspokenness has happened on gutvol-d. Did it happen somewhere else? By the way, I'm curious as to why D. Starner would like to get rid of those editions. Are they particularly bad/questionable for some reason? > Would someone be willing to do all the work to donate a Britannica 11th > to Project Gutenberg this year if they thought it would be removed from > Project Gutenberg a decade after it was first included? Yes, I would. And if you ask "Why would you go to all that effort, only to have the results deleted in 10 years?", I would point out that it would only be PG's copy that's deleted; the results of my work would live on, elsewhere on the web. -Michael From joshua at hutchinson.net Thu Jan 20 13:01:44 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 20 13:01:54 2005 Subject: [gutvol-d] Moving and Removing eBooks Message-ID: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> I do remember a discussion last month about a file format that is completely inaccessible (the reader no longer exists). There are one or two ebooks in the PG collection in this format. There was a call to deprecate those versions into an OLD subfolder (or something like that) so that people coming to the site weren't confused. No one advocated deletion, only better cataloging. That is probably where Michael's dozen is coming from, because many people thought this was a sensible step to take (and basically only Michael thought it wasn't).
Josh ----- Original Message ----- From: "Michael Dyck" To: "Project Gutenberg Volunteer Discussion" Subject: Re: [gutvol-d] Moving and Removing eBooks Date: Thu, 20 Jan 2005 12:27:47 -0800 > > Michael Hart wrote: > > > > However, there are at least a dozen or two very outspoken volunteers at > > Project Gutenberg among a dozen or two thousand of such volunteers, who > > would prefer to delete many of the original Project Gutenberg eBooks in > > favor of replacing them with something else, as opposed to just working > > on them to bring them up to the standards of the modern era of eBooks. > > This sounds like an exaggeration to me. It's true that (on January 4th) > D. Starner asked[1] if we could "get rid of" PG's World Library Editions > of Shakespeare, and appeared to be in favour of doing so. See: > http://lists.pglaf.org/private.cgi/gutvol-d/2005-January/001133.html > However, > > (a) I don't see that anyone agreed with the deletion. So that's one > outspoken volunteer, not "at least a dozen or two". > > (b) I don't see anyone recommending the deletion of any other books. > > If there are posts I've missed that would support the assertion above, > feel free to give links to the archive. > > Mind you, that's assuming that the outspokenness has happened on > gutvol-d. Did it happen somewhere else? > > By the way, I'm curious as to why D. Starner would like to get rid of > those editions. Are they particularly bad/questionable for some reason? > > > > Would someone be willing to do all the work to donate a Britannica 11th > > to Project Gutenberg this year if they thought it would be removed from > > Project Gutenberg a decade after it was first included? > > Yes, I would. And if you ask "Why would you go to all that effort, only > to have the results deleted in 10 years?", I would point out that it > would only be PG's copy that's deleted; the results of my work would > live on, elsewhere on the web. 
> > -Michael > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Thu Jan 20 13:07:24 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jan 20 13:07:33 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs.anything else Message-ID: <20050120210724.A15E22FA42@ws6-3.us4.outblaze.com> ----- Original Message ----- From: Gutenberg9443@aol.com > > I think this is a strong argument for continuing Michael's practice of > posting everything in English in .txt AND whatever else rather than in whatever > else by itself. > No one has advocated doing away with .txt as a posted format. The closest I can recall is a suggestion to do away with 7-bit ASCII text files. Deleted accents never made much sense to me anyway, so I certainly wouldn't cry over the loss of 7-bit ASCII. The only books that should be unavailable in text are ones that physically CAN'T be done in text, such as the complex mathematical works that requires TeX and PDF work. If anyone finds a book that does not have a text version posted should contact the bugs email address at PG to get that fixed ASAP. Josh From jonhendry at mac.com Thu Jan 20 13:29:25 2005 From: jonhendry at mac.com (Jonathan Hendry) Date: Thu Jan 20 13:29:40 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs. anything else In-Reply-To: <1a5.2f2197b2.2f216150@aol.com> References: <1a5.2f2197b2.2f216150@aol.com> Message-ID: <5B88E7D4-6B2A-11D9-AD45-000A956D5546@mac.com> On Jan 20, 2005, at 2:32 PM, Gutenberg9443@aol.com wrote: > ? > I know what he's talking about because I downloaded a copy of > Kipling's story "The Brushwood Boy" a couple of weeks ago in .htm form > and found a lot of rubbish characters in it, but I just read around > them despite feeling rather exasperated. 
I had been seized with an > acute wish to read the story at three AM, and after getting up, > booting my computer etc., downloading it, putting it on my ebook > reader, shutting everything down again, and going back to bed, my > desire to redownload it at that time was nonexistent. Anyway, that > could not possibly have been a recent post, though for all I know it > might be a recent REpost. Most .htm files work just fine on that ebook > reader. It sounds like the file's bytes are being interpreted as the wrong text encoding. If I'm not mistaken, this is a problem with 8-bit ASCII, because there are various ways of using the upper 128 code points to represent characters. This especially is a problem with accented characters. Your correspondent may be using a program which assumes a different text encoding than is used in the PG files he has been opening. It may be assuming "Mac OS" encoding, when the text is in something else. This can also afflict text in an HTML file. If you go to a page in a foreign language, and the characters are not represented correctly even though you have a compatible font, it's probably the text encoding. On the Mac, the Safari browser has a submenu (View->Text Encoding) which allows you to select from a variety of text encodings. Find the right one, and the text will appear correct, without wrong characters or missing glyphs. This may be the problem with your ebook reader. Does it allow you to select the encoding used to interpret the html? Your friend in Thailand might also want to check for such a feature in his software. - Jon Hendry From jtinsley at pobox.com Thu Jan 20 15:17:58 2005 From: jtinsley at pobox.com (Jim Tinsley) Date: Thu Jan 20 15:18:10 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs.
anything else In-Reply-To: <1a5.2f2197b2.2f216150@aol.com> References: <1a5.2f2197b2.2f216150@aol.com> Message-ID: <20050120231758.GA6406@panix.com> On Thu, Jan 20, 2005 at 02:32:32PM -0500, Gutenberg9443@aol.com wrote: >I just got an extremely distressed snailmail letter from a retired professor >of Oriental and Buddhist studies, who is blind and lives in Thailand. > >I quote him: > >"Now as far as I can tell there are no more e-books available in "txt" >format but only in "HTM" format. This seems very odd to me. I just ran a check, and I can find exactly 59 etext numbers for which there is HTML but no equivalent text file. Most of these are collections of images. A few are cases where the HTML was posted as a different number from the existing .txt. We don't do that any more, but there are some old cases. One was a bad upload, which I'm hunting around for a fix for now. Seriously, are you sure he's actually looking at _our_ site, as opposed to some other site like Blackmask? > >I know what he's talking about because I downloaded a copy of Kipling's >story "The Brushwood Boy" a couple of weeks ago in .htm form And this is why I'm asking. The only copy of this title I can find in PG is in "The Day's Work" collection, in file dyswk10.txt, which is plain text. I don't see how you could have downloaded this from PG in HTML format. jim From shalesller at writeme.com Thu Jan 20 21:04:31 2005 From: shalesller at writeme.com (D. Starner) Date: Thu Jan 20 21:04:48 2005 Subject: [gutvol-d] Moving and Removing eBooks Message-ID: <20050121050431.B0B9B4BDAA@ws1-1.us4.outblaze.com> "Michael Dyck" writes: > By the way, I'm curious as to why D. Starner would like to get rid of > those editions. Are they particularly bad/questionable for some reason? They're copyrighted, and PG has generally discouraged, for good reason, copyrighted editions of public domain works. 
From what Michael Hart said, I got the impression that it was an early 1930s edition; hence it does the other thing that annoys PGers to no end, adding a new copyright to material with no new copyrightable material. And if the edition really is so much better than the public domain ones, there should be something on the files talking about the edition, instead of it just looking like someone else digitalized a public domain edition and decided to give PG a copy with a new copyright on it. > > Would someone be willing to do all the work to donate a Britannica 11th > > to Project Gutenberg this year if they thought it would be removed from > > Project Gutenberg a decade after it was first included? > > Yes, I would. And if you ask "Why would you go to all that effort, only > to have the results deleted in 10 years?", I would point out that it > would only be PG's copy that's deleted; the results of my work would > live on, elsewhere on the web. /* 'Look on my works, ye mighty, and despair!' Nothing beside remains: round the decay Of that colossal wreck, boundless and bare, The lone and level sands stretch far away. */ Any person's actions are ultimately ephemeral. Most effort in the real world goes to the completely ephemeral; most of the rest, blogs and garage bands and writers trying to create the Great American Novel, will disappear not long after it was created without even creating a ripple on the sea of life. Kubla Khan, the work of a couple hours spent as high as a kite, is interesting; the poetry of most college students, even stuff labored over for months, isn't. It doesn't matter how much work went into it; what matters is useful results. Speaking personally, I've got scans of several books on my hard drive that turned out to already be in progress. I shrug my shoulders and go on; that some part of my work will turn out to be for naught is inevitable.
In the process of producing a German translation of Alice in Wonderland, I have scanned new versions of the Tenniel illustrations, as the previous ones are too small to capture the detail, and we can afford larger file sizes nowadays. If someone in ten years decides to produce the definitive version of the Tenniel illustrations and push aside my edition, then so be it; I certainly won't take it personally. If something of mine has to be redone or deleted, then it has to be redone or deleted. I certainly don't advocate the widescale wasting of work that's already done, but I don't agree with the argument that we should keep something of no value because someone put work into it. But I certainly seem to be a minority voice here. From marcello at perathoner.de Thu Jan 20 12:46:29 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Fri Jan 21 10:09:07 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs. anything else In-Reply-To: <1a5.2f2197b2.2f216150@aol.com> References: <1a5.2f2197b2.2f216150@aol.com> Message-ID: <41F018A5.2030005@perathoner.de> Gutenberg9443@aol.com wrote: > "Now as far as I can tell there are no more e-books available in "txt" > format but only in "HTM" format. I am blind as I stated at the outset. I use an > Apple computer with a screen reading programme called "OutSpoken" which > conflicts with many document formats. It also conflicts with many common programmes, > but that's another story which has nothing to do with you. > > "My problem with the Project Gutenberg Archive in its present form is that > when I download a book, the file on my computer includes many rubbish > characters which make it virtually impossible to read." He should have told us what model Apple computer he is using and what character encoding(s) his program expects.
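The "rubbish characters" symptom described in this thread can be reproduced in a few lines. What follows is only a minimal sketch, assuming (as a guess, not something the letter confirms) that the file was saved as ISO-8859-1 and that the reading software assumed the classic Mac OS Roman code page; the sample string is made up for illustration.

```python
# Illustrative only: the sample text and the Latin-1 / Mac OS Roman pairing
# are assumptions, not details taken from the letter.
original = "café, naïve, Ántonia"

# Bytes as an ISO-8859-1 (Latin-1) text file would store them.
latin1_bytes = original.encode("latin-1")

# A program that assumes Mac OS Roman maps the same bytes to different
# characters, so the accented letters come out wrong.
misread = latin1_bytes.decode("mac_roman")

# Decoding with the encoding the file was actually written in recovers the text.
recovered = latin1_bytes.decode("latin-1")

# Unaccented letters survive either way, because both code pages agree
# on the values below 128.
ascii_only = "The Brushwood Boy".encode("latin-1").decode("mac_roman")

print(misread)
print(recovered)
```

Only the accented letters change, which fits a text that is mostly readable but sprinkled with wrong characters.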
The online recoding service offers recoding into "Apple MacIntosh" character set. -- Marcello Perathoner webmaster@gutenberg.org From hart at pglaf.org Fri Jan 21 10:40:23 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 21 10:40:25 2005 Subject: [gutvol-d] letter from blind PG user in Thailand RE: .txt vs. anything else In-Reply-To: <20050120231758.GA6406@panix.com> References: <1a5.2f2197b2.2f216150@aol.com> <20050120231758.GA6406@panix.com> Message-ID: PG often gets credit, or discredit, as the case may be, for all sorts of eBooks all over the world that we had nothing to do with, other than starting the eBook idea. Michael From hart at pglaf.org Fri Jan 21 11:01:03 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 21 11:01:05 2005 Subject: [gutvol-d] Moving and Removing eBooks In-Reply-To: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> Message-ID: On Thu, 20 Jan 2005, Joshua Hutchinson wrote: > I do remember a discussion last month about a file format that is completely > inaccessible (the reader no longer exists). There are one or two ebooks in the > PG collection in this format. There was a call to deprecate those versions > into an OLD subfolder (or something like that) so that people coming to the > site weren't confused. Everyone refers to their own suggestions as "better". . .as "reform," etc. > No one advocated deletion, only better cataloging. This is why the subject says both "Moving" and "Removing". . .the removing refers to suggestions for other than the Folio file mentioned previously, including eBook #100, the Complete Works of Shakespeare. As for "only better cataloging". . .the obvious thing is simply to point plainly to both versions, with a note that the Folio format requires a proprietary reader. > That is probably where Michael's dozen is coming from, because many people > thought this was a sensible step to take (and basically only Michael thought > it wasn't).
Actually, the dozen comes from various discussions we've had over time. Some of us try to remember the past as we plan for the future. > > Josh > > > ----- Original Message ----- From: "Michael Dyck" To: [snip] Michael From ajhaines at shaw.ca Fri Jan 21 11:09:08 2005 From: ajhaines at shaw.ca (Al Haines (shaw)) Date: Fri Jan 21 11:09:17 2005 Subject: [gutvol-d] Marking Bold in text files Message-ID: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> In November 2004 there were several threads dealing with marking bold text in text files ("Marking bold & italic in .txt"). One of those messages, dated Nov 12, 2004, indicated that the PG FAQ's would be updated to indicate the use of asterisks (*) to mark bold text, similar to PG's FAQ V.94's standard of using underscores to indicate italicized text. Was this update ever done? FYI - I checked the Distributed Proofing site's FAQ's, but the only reference I can find to bold text says to use the HTML <b> and </b> tags. In the absence of a PG standard, is it OK for me to use asterisks to indicate bold text within a document? (Note that I'd only be doing this for bold text inside paragraphs, not for things like headings that might be bolded as part of some automatic formatting process.) Al From joshua at hutchinson.net Fri Jan 21 11:36:35 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 11:36:43 2005 Subject: [gutvol-d] Marking Bold in text files Message-ID: <20050121193635.3DF662F8FB@ws6-3.us4.outblaze.com> ----- Original Message ----- From: "Al Haines (shaw)" > > FYI - I checked the Distributed Proofing site's FAQ's, but the only reference > I can find to bold text says to use the HTML <b> and </b> tags. > Just a quick elaboration... The DP faq is talking about markup in the proofing rounds, not the final product to PG.
We do a lot of things in the proofing rounds that get changed (or in some encodings, deleted) in DP post-processing. As far as I know, * to indicate bold is acceptable in the text versions. Josh From hart at pglaf.org Fri Jan 21 11:36:52 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 21 11:36:54 2005 Subject: !@! Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <41F00ABD.1050804@perathoner.de> References: <41F00ABD.1050804@perathoner.de> Message-ID: On Thu, 20 Jan 2005, Marcello Perathoner wrote: > Michael Hart wrote: > >> However, there are at least a dozen or two very outspoken volunteers at >> Project Gutenberg among a dozen or two thousand of such volunteers, who >> would prefer to delete many of the original Project Gutenberg eBooks in >> favor of replacing them with something else, as opposed to just working >> on them to bring them up to the standards of the modern era of eBooks. > > This is a deliberate mis-statement of the facts. > > Glossary: "A dozen or two very outspoken volunteers": those who spoke up > against Michael. "among 12 thousand": the rhetoric of the silent majorities > is an instrument widely used in propaganda. The speaker stipulates the > existence of a fictitious silent majority who are in favour of his ideas. > This didn't work when used against the peace movement in the 80s and doesn't > work now. I presume you can see through the fallacy of this misstatement as easily as you could see through the fallacy of starting Moore's Law projections from having 1 file in 1971. Some people remember these fallacious notes, not only from the most recent week or month. I tried to engage some of the authors of these fallacies offline, but it becomes obvious they only want to speak in front of a large crowd, and not to actually solve or resolve the situation, or to stand or understand, concerning the actual questions at hand.
To use the words misused above in a proper context, I have never mentioned any kind of "silent majority," much less "stipulated the existence of a fictitious silent majority," nor is there any need for anyone to do so, because virtually anyone can do as they please in Project Gutenberg. There can be no tyranny of the majority, nor of a very vocal minority, as has been previously stated quite clearly by the various FAQs, Mission Statements, etc. . . . There is no need for any ruling by any portion of Project Gutenberg simply because we give the OK for virtually every project proposed. As for those who would like to rule OUT what others have done, "Moving and Removing" various efforts from our past history, that is not very likely. If you want to suggest removing something, why not look at some of the new Science Fiction that was requested? Those were posted purely as an experiment, and only recently, with minimal effort by Project Gutenberg personnel. . . . As for picking on items that have been in our collection for over a decade, you are a little late, and history will not be rewritten at such requests, nor will it be swept under a rug. "Those who do not study history are condemned to repeat it." For those who have not studied the history of Project Gutenberg, this is not the first time we have had very vocal suggestions to change history, change direction, or any of the other suggestions made by a very vocal 1/1000th of our volunteers that would change Project Gutenberg into their own private fiefdom. When these people try to take charge, which is every 5 years or so, the answer is inevitably, "You can do virtually anything you like in your own portion of Project Gutenberg, but you can't tell others they cannot do virtually anything they like in their portion of PG." Project Gutenberg has what is generally known as an "Open Door Policy."
This means that virtually anyone and everyone are welcome, and their contribution will be used as best we can manage, even if, as has been the case throughout our history, when items might not be quite proper for Project Gutenberg, so we pass them on to other eBook operations, as others have also done in our direction. Our Mission is to: "ENCOURAGE THE CREATION AND DISTRIBUTION OF eBOOKS" and "BREAK DOWN THE BARS OF IGNORANCE AND ILLITERACY" Rather than imposing our will on everyone, we prefer to: "LEAD BY EXAMPLE" To lead by example the examples have to be there, in full view of the world, so people can see what has been tried in the past, how it works presently, and hopefully figure out ways things will work in the future. This is why we do not "Move or Remove" eBooks from the more visible locations to the less visible. Michael From krooger at debian.org Fri Jan 21 11:37:13 2005 From: krooger at debian.org (Jonathan Walther) Date: Fri Jan 21 11:37:25 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> References: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> Message-ID: <20050121193713.GA5229@reactor-core.org> On Fri, Jan 21, 2005 at 11:09:08AM -0800, Al Haines (shaw) wrote: > FYI - I checked the Distributed Proofing site's FAQ's, but the > only reference I can find of bold text says to use the HTML <b> and > </b> tags. Personally I would prefer that italics also be done as <i> and </i>; then it would be easy to strip them out with software. The _ character usually indicates a mistake in the conversion process. Jonathan -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura!
Eukleia: Jonathan Walther Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) Contact: 604-582-9308 (between 7am and 11pm, PST) Website: http://reactor-core.org/ Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery It's not true unless it makes you laugh, but you don't understand it until it makes you weep. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: Digital signature Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050121/20415877/attachment.bin From krooger at debian.org Fri Jan 21 11:38:13 2005 From: krooger at debian.org (Jonathan Walther) Date: Fri Jan 21 11:38:22 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <20050121193635.3DF662F8FB@ws6-3.us4.outblaze.com> References: <20050121193635.3DF662F8FB@ws6-3.us4.outblaze.com> Message-ID: <20050121193813.GB5229@reactor-core.org> On Fri, Jan 21, 2005 at 02:36:35PM -0500, Joshua Hutchinson wrote: >As far as I know, * to indicate bold is acceptable in the text >versions. Sometimes * indicates a footnote; how do you distinguish using automated tools? Jonathan -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! Eukleia: Jonathan Walther Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) Contact: 604-582-9308 (between 7am and 11pm, PST) Website: http://reactor-core.org/ Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery It's not true unless it makes you laugh, but you don't understand it until it makes you weep. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: Digital signature Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050121/5d72a6cd/attachment.bin From hart at pglaf.org Fri Jan 21 11:39:15 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 21 11:39:17 2005 Subject: [gutvol-d] PT1 History of PG In-Reply-To: <41EFFBA3.8040503@perathoner.de> References: <41EFFBA3.8040503@perathoner.de> Message-ID: Just a simple question. . .how many people believe any of this? Should I really go through the motions of refuting it again, and again, and again? As I said privately, offline, I don't think even the speaker believes what he is saying. . . . Michael On Thu, 20 Jan 2005, Marcello Perathoner wrote: > Michael Hart wrote: > >> Obviously no one ever seriously considered that Project Gutenberg >> might actually release a million eBooks in 2001, but there were a >> few examples recently of suggestions that we should have used the >> 1971 date, and thus the resultant figures listed above when doing >> our Moore's Law predictions. > > [epighraph] > > "Contrary to popular claims, it appears that the common versions of > Moore's Law have not been valid during the last decades. As > semiconductors are becoming important in economy and society, Moore's > Law is now becoming an increasingly misleading predictor of future > developments." > > ... > > "Indeed, sociologically Moore's Law is a fascinating case of how myths are > manufactured in the modern society and how such myths rapidly propagate into > scientific articles, speeches of leading industrialists, and government > policy reports around the world." > > http://firstmonday.org/issues/issue7_11/tuomi/index.html > > [/epigraph] > > Read that page for the sad truth about Moore's "Law". 
> > > My suggestion was to stop using arbitrary data to keep up the illusion > of Moore's Law (which, if you had read that page, would have known never > worked even for computers) but to use real data to show that Moore's > "Law" does not fit to PG production. > > The suggestion was to use the real date the project started (1971) > instead of your fictitious and arbitrary one (1990). > > Of course, using real dates, the idea that PG production followed > Moore's Law dies a horrible death. > > > Even looking at the relatively short period of Nov 2003 to Nov 2004 we > can prove in a very simple manner that Moore's Law doesn't hold. > > 1. In Nov 2003 we had 10000 books. > > 2. Applying Moore's Law, in Nov 2004 we should have had > 10000 * 2 ^ (12/18) = 15874 books. > > 3. Moore's Law does not hold for PG archive size. > > QED > > > The sad fact is: some people with a marketing person's mind prefer to > stick to a phony and proven wrong "Law" because it is such a slick > formulation. > > Why do we need flashy formulas at all? If we say: "we got 15000 books > today" isn't that enough? Don't you have faith in the facts? > > > -- > Marcello Perathoner > webmaster@gutenberg.org > > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From joshua at hutchinson.net Fri Jan 21 11:45:21 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 11:45:28 2005 Subject: [gutvol-d] Marking Bold in text files Message-ID: <20050121194521.579C44F51D@ws6-5.us4.outblaze.com> * should be used to indicate a footnote in a PG text (though there may be OLD examples of doing such). For instance, all DP-generated text's footnotes would mark the * as [A]. A numbered footnote would look like this [1]. Chances are, if any old texts use * as a footnote marker, when they get updated, that will probably be updated as well. 
Josh ----- Original Message ----- From: "Jonathan Walther" To: "Project Gutenberg Volunteer Discussion" Subject: Re: [gutvol-d] Marking Bold in text files Date: Fri, 21 Jan 2005 11:38:13 -0800 > > On Fri, Jan 21, 2005 at 02:36:35PM -0500, Joshua Hutchinson wrote: > > As far as I know, * to indicate bold is acceptable in the text > > versions. > > Sometimes * indicates a footnote; how do you distinguish using automated > tools? > > Jonathan > > -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! > Eukleia: Jonathan Walther > Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) > Contact: 604-582-9308 (between 7am and 11pm, PST) > Website: http://reactor-core.org/ > > Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery > Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery > > It's not true unless it makes you laugh, > but you don't understand it until it makes you weep. << signature.asc >> > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From hart at pglaf.org Fri Jan 21 11:48:07 2005 From: hart at pglaf.org (Michael Hart) Date: Fri Jan 21 11:48:08 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <20050121194521.579C44F51D@ws6-5.us4.outblaze.com> References: <20050121194521.579C44F51D@ws6-5.us4.outblaze.com> Message-ID: On Fri, 21 Jan 2005, Joshua Hutchinson wrote: > * should be used to indicate a footnote in a PG text (though there may be OLD > examples of doing such). For instance, all DP-generated text's footnotes > would mark the * as [A]. A numbered footnote would look like this [1]. > > Chances are, if any old texts use * as a footnote marker, when they get > updated, that will probably be updated as well. Some have used [FN1] to insure against false hits if the author uses [1] for other purposes. 
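A minimal sketch of how an automated tool might tell the markers discussed here apart, assuming only the conventions mentioned in this thread: `[1]`, `[A]` and `[FN1]` for footnotes, and `*phrase*` for bold. The function name `classify` and both regular expressions are hypothetical illustrations, not part of gutcheck or any other PG or DP tool.

```python
import re

# Hypothetical patterns built only from the conventions named in this thread.
# Footnote markers: [1], [A], or the more defensive [FN1].
FOOTNOTE_MARKER = re.compile(r"\[(?:FN)?(\d+|[A-Z])\]")
# Bold spans: an asterisk pair hugging a phrase, with no space just inside.
BOLD_SPAN = re.compile(r"\*(\S(?:[^*\n]*\S)?)\*")


def classify(text):
    """Return (footnote_markers, bold_phrases, stray_asterisks) for one paragraph."""
    footnotes = [m.group(0) for m in FOOTNOTE_MARKER.finditer(text)]
    bold = [m.group(1) for m in BOLD_SPAN.finditer(text)]
    # Whatever asterisks remain after removing bold spans may be old-style
    # footnote marks or scanning errors, so they are only counted, not judged.
    stray = BOLD_SPAN.sub("", text).count("*")
    return footnotes, bold, stray


if __name__ == "__main__":
    sample = "See the note[FN1] on *a bolded phrase* here.* Also [A] and [1]."
    print(classify(sample))
```

A bracketed marker such as `[FN1]` is unambiguous to a pattern like this, while a lone `*` can only be reported for a human to look at, which is roughly the trade-off being discussed.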
Michael > > Josh > > > ----- Original Message ----- > From: "Jonathan Walther" > To: "Project Gutenberg Volunteer Discussion" > Subject: Re: [gutvol-d] Marking Bold in text files > Date: Fri, 21 Jan 2005 11:38:13 -0800 > >> >> On Fri, Jan 21, 2005 at 02:36:35PM -0500, Joshua Hutchinson wrote: >>> As far as I know, * to indicate bold is acceptable in the text >>> versions. >> >> Sometimes * indicates a footnote; how do you distinguish using automated >> tools? >> >> Jonathan >> >> -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! >> Eukleia: Jonathan Walther >> Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) >> Contact: 604-582-9308 (between 7am and 11pm, PST) >> Website: http://reactor-core.org/ >> >> Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery >> Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery >> >> It's not true unless it makes you laugh, >> but you don't understand it until it makes you weep. > << signature.asc >> > >> >> >> _______________________________________________ >> gutvol-d mailing list >> gutvol-d@lists.pglaf.org >> http://lists.pglaf.org/listinfo.cgi/gutvol-d > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From hacker at gnu-designs.com Fri Jan 21 11:46:56 2005 From: hacker at gnu-designs.com (David A. Desrosiers) Date: Fri Jan 21 11:48:16 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> References: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> Message-ID: > One of those messages, dated Nov 12, 2004, indicated that the PG > FAQ's would be updated to indicate the use of asterisks (*) to mark > bold text, similar to PG's FAQ V.94's standard of using underscores > to indicate italicized text. Was this update ever done? 
I can't help but notice that this is the exact opposite of the Markdown syntax and format used by many "wiki" forums and online web applications. Any particular reason for reinventing this wheel again? Why not reuse one of the existing systems for plain-text markup, that is ultimately easier to parse out for further conversion back to any other format (bbcode, Markdown, wikitext). Just my 0.02 Euros. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From joshua at hutchinson.net Fri Jan 21 11:49:24 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 11:49:32 2005 Subject: [gutvol-d] PT1 History of PG Message-ID: <20050121194924.9F8C7109922@ws6-4.us4.outblaze.com> Believe what? That Moore's Law was never meant to be applied to PG? Yes, I believe that. That Moore's Law is being applied arbitrarily to PG production? Yes, I believe that. That Moore's Law is not being adhered to by our production EXCEPT for the very specific and arbitrary start date chosen by you? Yes, I believe that. Do I believe it is a big deal? No. It is obviously a marketing gimmick with no real value. MOST marketing gimmicks "cook the numbers" in some way, so this just adheres to the grand traditions of marketing/public relations. Josh ----- Original Message ----- From: "Michael Hart" To: "Project Gutenberg Volunteer Discussion" Subject: Re: [gutvol-d] PT1 History of PG Date: Fri, 21 Jan 2005 11:39:15 -0800 (PST) > > > Just a simple question. . .how many people believe any of this? > > Should I really go through the motions of refuting it again, > and again, and again? > > As I said privately, offline, I don't think even the speaker > believes what he is saying. . . . 
> > Michael > > > On Thu, 20 Jan 2005, Marcello Perathoner wrote: > > > Michael Hart wrote: > > > >> Obviously no one ever seriously considered that Project Gutenberg > >> might actually release a million eBooks in 2001, but there were a > >> few examples recently of suggestions that we should have used the > >> 1971 date, and thus the resultant figures listed above when doing > >> our Moore's Law predictions. > > > > [epighraph] > > > > "Contrary to popular claims, it appears that the common versions of > > Moore's Law have not been valid during the last decades. As > > semiconductors are becoming important in economy and society, Moore's > > Law is now becoming an increasingly misleading predictor of future > > developments." > > > > ... > > > > "Indeed, sociologically Moore's Law is a fascinating case of how myths are > > manufactured in the modern society and how such myths rapidly propagate into > > scientific articles, speeches of leading industrialists, and government > > policy reports around the world." > > > > http://firstmonday.org/issues/issue7_11/tuomi/index.html > > > > [/epigraph] > > > > Read that page for the sad truth about Moore's "Law". > > > > > > My suggestion was to stop using arbitrary data to keep up the illusion > > of Moore's Law (which, if you had read that page, would have known never > > worked even for computers) but to use real data to show that Moore's > > "Law" does not fit to PG production. > > > > The suggestion was to use the real date the project started (1971) > > instead of your fictitious and arbitrary one (1990). > > > > Of course, using real dates, the idea that PG production followed > > Moore's Law dies a horrible death. > > > > > > Even looking at the relatively short period of Nov 2003 to Nov 2004 we > > can prove in a very simple manner that Moore's Law doesn't hold. > > > > 1. In Nov 2003 we had 10000 books. > > > > 2. Applying Moore's Law, in Nov 2004 we should have had > > 10000 * 2 ^ (12/18) = 15874 books. 
> > > > 3. Moore's Law does not hold for PG archive size. > > > > QED > > > > > > The sad fact is: some people with a marketing person's mind prefer to > > stick to a phony and proven wrong "Law" because it is such a slick > > formulation. > > > > Why do we need flashy formulas at all? If we say: "we got 15000 books > > today" isn't that enough? Don't you have faith in the facts? > > > > > > -- Marcello Perathoner > > webmaster@gutenberg.org > > > > > > > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Fri Jan 21 11:51:40 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 11:51:48 2005 Subject: [gutvol-d] Marking Bold in text files Message-ID: <20050121195140.4448C4F521@ws6-5.us4.outblaze.com> Well, technically, we were here first... They reinvented OUR wheel! ;) PG has been using _ to indicate italics for as long as I can remember. Josh ----- Original Message ----- From: "David A. Desrosiers" To: "Project Gutenberg Volunteer Discussion" Subject: Re: [gutvol-d] Marking Bold in text files Date: Fri, 21 Jan 2005 14:46:56 -0500 (EST) > > > > One of those messages, dated Nov 12, 2004, indicated that the PG FAQ's would > > be updated to indicate the use of asterisks (*) to mark bold text, similar > > to PG's FAQ V.94's standard of using underscores to indicate italicized > > text. Was this update ever done? > > I can't help but notice that this is the exact opposite of the > Markdown syntax and format used by many "wiki" forums and online web > applications. Any particular reason for reinventing this wheel again? 
> Why not reuse one of the existing systems for plain-text markup, that > is ultimately easier to parse out for further conversion back to any > other format (bbcode, Markdown, wikitext). > > Just my 0.02 Euros. > > David A. Desrosiers > desrod@gnu-designs.com > http://gnu-designs.com > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Fri Jan 21 12:05:48 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 12:05:55 2005 Subject: [gutvol-d] Moving and Removing eBooks Message-ID: <20050121200548.A63EF109928@ws6-4.us4.outblaze.com> ----- Original Message ----- From: "Michael Hart" > > On Thu, 20 Jan 2005, Joshua Hutchinson wrote: > > > I do remember a discussion last month about a file format that is completely > > unaccesible (the reader no longer exists). There is one or two ebooks in > > the PG collection in this format. There was call to deprecate those > > versions into an OLD subfolder (or something like that) so that people > > coming to the site weren't confused. > > Everyone refers to their own suggestions as "better". . .as "reform," etc. > Well ... duh! Do you honestly think people would suggest things that they think are "worse" than what is already being done? The test is whether OTHER people think those ideas are "better". > > No one advocated deletion, only better cataloging. > > As for "only better cataloging". . .the obvious thing is simply to point > plainly to both versions, with a note that the Folio format requires a > proprietary reader. > Allow me to use our very first ebook as an example of how we don't link to everything available on many things already. http://www.gutenberg.org/etext/1 As you can see at the above link, we have a link to different formats. However, we only show one edition of each one. For instance, the plain text version links to edition 12. 
However, if you go look at the etext90 directory, you'll see that there is an edition 11 available in plain text. We don't link to it, though. You see, we keep old stuff, but we don't have to link to it from the bibrec pages. That is all people were advocating in this particular case. We have a precedent of deprecating some things. Why is this particular case any different (especially since this has more potential to cause confusion)? > > That is probably where Michael's dozen is coming from, because many people > > thought this was a sensible step to take (and basically only Michael thought > > it wasn't). > > Actually, the dozen comes from various discussions we've had over time. > Fair enough. It wasn't clear in your original post what "dozens" were referring to. I apologize for putting words in your mouth. > > Some of us try to remember the past as we plan for the future. > Rather a non-sensical thing to say in this context. No one advocated forgetting the past (i.e., deleting anything). They did advocate reorganizing how we access that past data. Josh From krooger at debian.org Fri Jan 21 12:19:47 2005 From: krooger at debian.org (Jonathan Walther) Date: Fri Jan 21 12:19:56 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <20050121195140.4448C4F521@ws6-5.us4.outblaze.com> References: <20050121195140.4448C4F521@ws6-5.us4.outblaze.com> Message-ID: <20050121201947.GA13223@reactor-core.org> On Fri, Jan 21, 2005 at 02:51:40PM -0500, Joshua Hutchinson wrote: >Well, technically, we were here first... They reinvented OUR wheel! ;) > >PG has been using _ to indicate italics for as long as I can remember. Is each word individually italicized? It would look messy, but be robust. Jonathan -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! 
Eukleia: Jonathan Walther Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) Contact: 604-582-9308 (between 7am and 11pm, PST) Website: http://reactor-core.org/ Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery It's not true unless it makes you laugh, but you don't understand it until it makes you weep. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: Digital signature Url : http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050121/82dd9dc7/attachment.bin From jmdyck at ibiblio.org Fri Jan 21 12:14:45 2005 From: jmdyck at ibiblio.org (Michael Dyck) Date: Fri Jan 21 12:26:47 2005 Subject: [gutvol-d] Moving and Removing eBooks References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> Message-ID: <41F162B5.AB0217D0@ibiblio.org> Michael Hart wrote: > > there are at least a dozen or two very outspoken volunteers at Project > Gutenberg among a dozen or two thousand of such volunteers, who would > prefer to delete many of the original Project Gutenberg eBooks > > ... the removing refers to suggestions ... > including eBook #100, the Complete Works of Shakespeare. > > ... the dozen comes from various discussions we've had over time. > > Some of us try to remember the past as we plan for the future. I'm trying to remember the past, but so far I'm not remembering it as you do. Could you be a bit more specific, to help jog my memory, and provide a basis for searching the archive? Could you name two or three other ebooks whose deletion was sought/recommended/suggested? 
-Michael

From joshua at hutchinson.net Fri Jan 21 12:34:54 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Fri Jan 21 12:35:03 2005
Subject: [gutvol-d] Marking Bold in text files
Message-ID: <20050121203454.9D07B4F531@ws6-5.us4.outblaze.com>

----- Original Message -----
From: "Jonathan Walther"
>
> On Fri, Jan 21, 2005 at 02:51:40PM -0500, Joshua Hutchinson wrote:
> > Well, technically, we were here first... They reinvented OUR wheel! ;)
> >
> > PG has been using _ to indicate italics for as long as I can remember.
>
> Is each word individually italicized? It would look messy, but be
> robust.
>

No, each section of italicized words is marked.

EXAMPLE:

This is a sentence with _an italicized phrase_.

This is a sentence with *a bolded phrase*.

****

The obvious problem with this encoding is that it can be very difficult to determine that every opening _ is closed by the appropriate _. That is one of the reasons DP uses <i> and </i> in the proofing interface instead.

Josh

From ajhaines at shaw.ca Fri Jan 21 13:39:20 2005
From: ajhaines at shaw.ca (Al Haines (shaw))
Date: Fri Jan 21 13:39:29 2005
Subject: [gutvol-d] Marking Bold in text files
References: <20050121203454.9D07B4F531@ws6-5.us4.outblaze.com>
Message-ID: <004f01c50001$aa8a6110$6401a8c0@ahainesp2600>

Hmm... Hopefully I haven't opened *too* big a can of worms here, so I'll re-cast my question in terms of the specific book I'm working on.

It's a textbook on selected poems of Tennyson and Wordsworth. The poems have line reference numbers every 10 lines, and those and interim line numbers are used to refer to footnotes.

Each footnote consists of the line number (not bolded), then a fragment of that line of the poem (bolded), then the footnote body (not bolded) discussing the fragment.

It's the use of asterisks around the fragment portion of the footnote that my question was directed to.
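The balancing problem raised in this thread (every opening _ or * needing a matching close) can be roughly checked by counting markers per paragraph. This is only a sketch under stated assumptions: paragraphs are separated by blank lines, markers never nest, and the helper names `unbalanced_markers` and `check_text` are hypothetical, not taken from gutcheck or any DP tool.

```python
def unbalanced_markers(paragraph, markers="_*"):
    """Return the markers that occur an odd number of times in the paragraph."""
    return [m for m in markers if paragraph.count(m) % 2 == 1]


def check_text(text):
    """Yield (paragraph_number, odd_markers) for paragraphs that look suspicious.

    Assumes paragraphs are separated by blank lines and that an italic or
    bold span never crosses a paragraph boundary.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        odd = unbalanced_markers(paragraph)
        if odd:
            yield number, odd


if __name__ == "__main__":
    sample = (
        "This is a sentence with _an italicized phrase_.\n\n"
        "This one has an *unclosed bold phrase.\n\n"
        "Snake_case breaks the heuristic."
    )
    for number, odd in check_text(sample):
        print(f"paragraph {number}: odd count of {odd}")
```

A parity count like this cannot tell a stray footnote asterisk or an underscore inside a word from a genuinely unclosed span, so it would produce false positives of the kind mentioned elsewhere in the thread; explicit open and close tags avoid that ambiguity.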
----- Original Message ----- From: "Joshua Hutchinson" To: "Project Gutenberg Volunteer Discussion" Sent: Friday, January 21, 2005 12:34 PM Subject: Re: [gutvol-d] Marking Bold in text files ----- Original Message ----- From: "Jonathan Walther" > > On Fri, Jan 21, 2005 at 02:51:40PM -0500, Joshua Hutchinson wrote: > > Well, technically, we were here first... They reinvented OUR wheel! ;) > > > > PG has been using _ to indicate italics for as long as I can remember. > > Is each word individually italicized? It would look messy, but be > robust. > No, each section of italicized words is marked. EXAMPLE: This is a sentence with _an italicized phrase_. This is a sentence with *a bolded phrase*. **** The obvious problem with this encoding is that it can be very difficult to determine that every opening _ is closed by the appropriate _. That is one of the reasons DP uses and in the proofing interface instead. Josh _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Fri Jan 21 13:45:12 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jan 21 13:45:22 2005 Subject: [gutvol-d] Marking Bold in text files Message-ID: <20050121214512.C89342F8A4@ws6-3.us4.outblaze.com> Well, this is how I would handle it... :) [Footnote 1: *poem fragment* The footnote text itself.] The footnote number would be the poem line number it refers to. JHutch ----- Original Message ----- From: "Al Haines (shaw)" To: "Project Gutenberg Volunteer Discussion" Subject: Re: [gutvol-d] Marking Bold in text files Date: Fri, 21 Jan 2005 13:39:20 -0800 > > Hmm... Hopefully I haven't opened *too* big a can of worms here , so I'll > re-cast my question in terms of the specific book I'm working on. > > It's a textbook on selected poems of Tennyson and Wordsworth. 
The poems have > line reference numbers every 10 lines, and those and interim line numbers are > used to refer to footnotes. > > Each footnote consists of the line number (not bolded), then a fragment of > that line of the poem (bolded), then the footnote body (not bolded) discussing > the fragment. > > It's the use of asterisks around the fragment portion of the footnote that my > question was directed to. > > > ----- Original Message ----- From: "Joshua Hutchinson" > To: "Project Gutenberg Volunteer Discussion" > Sent: Friday, January 21, 2005 12:34 PM > Subject: Re: [gutvol-d] Marking Bold in text files > > > > ----- Original Message ----- > From: "Jonathan Walther" > > > > On Fri, Jan 21, 2005 at 02:51:40PM -0500, Joshua Hutchinson wrote: > > > Well, technically, we were here first... They reinvented OUR wheel! ;) > > > > > > PG has been using _ to indicate italics for as long as I can remember. > > > > Is each word individually italicized? It would look messy, but be > > robust. > > > > No, each section of italicized words is marked. > > EXAMPLE: > > This is a sentence with _an italicized phrase_. > > This is a sentence with *a bolded phrase*. > > **** > > The obvious problem with this encoding is that it can be very difficult to > determine that every opening _ is closed by the appropriate _. That is one of > the reasons DP uses and in the proofing interface instead. 
> > Josh > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From donovan at abs.net Fri Jan 21 17:19:22 2005 From: donovan at abs.net (D Garcia) Date: Fri Jan 21 17:20:18 2005 Subject: [gutvol-d] Marking Bold in text files In-Reply-To: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> References: <000a01c4ffec$af0435f0$6401a8c0@ahainesp2600> Message-ID: <200501212019.22795.donovan@abs.net> On Friday 21 January 2005 02:09 pm, Al Haines (shaw) wrote: > One of those messages, dated Nov 12, 2004, indicated that the PG FAQ's > would be updated to indicate the use of asterisks (*) to mark bold text, > similar to PG's FAQ V.94's standard of using underscores to indicate > italicized text. Was this update ever done? Why would PG decide to use a character commonly used to denote an OCR error as the bold type marker in ASCII files? Even gutcheck flags this as a possible error, and with it used that way, the WW and others will have to wade through a bunch of false positives from the output, or jim (poor guy) would have to add even more parsing to the gutcheck routines to try to reduce these false positives. Guess what's more likely to happen? I realize that (*) as bold was practice in the historical days of usenet, irc, etc., but it doesn't make any sense to me to use asterisk (*) in this fashion in an ebook. I've been seeing and usually use plus-sign (+) to indicate boldface. David From jtinsley at pobox.com Sat Jan 22 05:36:28 2005 From: jtinsley at pobox.com (Jim Tinsley) Date: Sat Jan 22 05:36:31 2005 Subject: [gutvol-d] Moving and Removing eBooks Message-ID: <20050122133628.GB28508@panix.com> On Thu, 20 Jan 2005 11:34:58 -0700, Jon Noring wrote: >[Note: many of the pre-DP texts need to be redone from scratch for >various reasons. 
For another project, I'm now working on My Ántonia by >Willa Cather, one of the early PG releases (#242), and the latest PG >edition of it, #11!, is horribly mangled from various edits, without >recourse to the original, during its lifetime. I interpreted this as an errata report, and checked it out. There have been no edits to this book during its lifetime, horrible mangles or otherwise. Edition 11 is just a re-wrapping of edition 10, with the words "THE END" removed from the end of the text. I have no idea why someone felt it desirable to rewrap the original text, since the original looks perfectly good to me; however, it was done. The date on edition 10 is Feb 16 1995, which is consistent with its original posting, so if myant10 was ever changed at all, it was within weeks of first posting, and not since -- not a shock; Judy always did good work! Apart from spaces, blank lines, and indents consequent to the re-wrapping, the two texts are identical, as anyone who checks can easily verify. jim From hart at pglaf.org Sat Jan 22 11:25:16 2005 From: hart at pglaf.org (Michael Hart) Date: Sat Jan 22 11:25:18 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <41F162B5.AB0217D0@ibiblio.org> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> Message-ID: On Fri, 21 Jan 2005, Michael Dyck wrote: > Michael Hart wrote: >> >> there are at least a dozen or two very outspoken volunteers at Project >> Gutenberg among a dozen or two thousand of such volunteers, who would >> prefer to delete many of the original Project Gutenberg eBooks >> >> ... the removing refers to suggestions ... >> including eBook #100, the Complete Works of Shakespeare. >> >> ... the dozen comes from various discussions we've had over time. >> >> Some of us try to remember the past as we plan for the future. > > I'm trying to remember the past, but so far I'm not remembering it as > you do. 
Could you be a bit more specific, to help jog my memory, and
> provide a basis for searching the archive? Could you name two or three
> other ebooks whose deletion was sought/recommended/suggested?

These conversations have appeared on various listservers, and over a long period of time, dating all the way back to the Bible #10, Roget's #22, the Bible #30, Sophocles #31, Jekyll and Hyde #42-43, The Gift of the Magi [no # at the time], Frankenstein #84 and 84a, and, of course, the Complete Shakespeare #100.

In more recent times such suggestions have appeared online and offline among the members of "The Book People," "The eBook List," "The PDA-eBook List" and I'm sure there were even more.

In addition, there are often such discussions immediately before, during, and after the release of various items. We are currently discussing how to present the Mahabharata, which is as long as the Bible and Shakespeare combined into one book. Some want to present it as a single huge file, while I, having written a paper on it in college, see a great value in presenting it both in a book by book format, as well as one huge file, for a wider range of useful searches. I certainly prefer to have both options available with Shakespeare and the Bible, and we have received many thank you notes for the works David Widger has so kindly prepared in a similar manner.

Of course, there is a wide range of years and events between those two paragraphs I mentioned earlier, including all the Dante translations and editions, starting with those surrounding #1,000, and proceeding right up to today, when I received a message stating a new translation is now becoming available. I sent off the copyright permission request, and hopefully we'll have yet another Dante shortly.
Those are just the ones off the top of my head, but I am sure we also talked about combining all the Benjamin Jowett translations, and many others, some of which combinations are created elsewhere for scholars to search, some of which come back to us, and some of which I am sure we never hear about at all.

In the end everyone is free to create their own PG collection as they want, and I am sure that includes a lot of things people have done that will never become popular, and some that will.

Once we get some requests for any certain presentation we always want to give our readers what they want, in addition to what our volunteer base wants for themselves.

Michael S. Hart

From jmdyck at ibiblio.org Sat Jan 22 12:25:24 2005
From: jmdyck at ibiblio.org (Michael Dyck)
Date: Sat Jan 22 12:26:01 2005
Subject: !@!Re: [gutvol-d] Moving and Removing eBooks
References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org>
Message-ID: <41F2B6B4.FE86DFFA@ibiblio.org>

Michael Hart wrote:
>
> On Fri, 21 Jan 2005, Michael Dyck wrote:
>
> > Michael Hart wrote:
> >>
> >> there are at least a dozen or two very outspoken volunteers at Project
> >> Gutenberg among a dozen or two thousand of such volunteers, who would
> >> prefer to delete many of the original Project Gutenberg eBooks
> >>
> >> ... the removing refers to suggestions ...
> >> including eBook #100, the Complete Works of Shakespeare.
> >>
> >> ... the dozen comes from various discussions we've had over time.
> >>
> >> Some of us try to remember the past as we plan for the future.
> >
> > I'm trying to remember the past, but so far I'm not remembering it as
> > you do. Could you be a bit more specific, to help jog my memory, and
> > provide a basis for searching the archive? Could you name two or three
> > other ebooks whose deletion was sought/recommended/suggested?
> > These conversations have appeared on various listservers, and over > a long period of time, dating all the way back to the Bible #10, > Roget's #22, the Bible #30, Sophocles #31, Jekyll and Hyde #42-43, > The Gift of the Magi [no # at the time], Frankenstein #84 and 84a, > and, of couse, the Complete Shakespeare #100. > > ... > > In addition, there are often such discussions immediately before during > and after the release of various items. We are currently discussing how > to present the Mahabharata, which as long as the Bible and Shakespeare > combined into one book. Some want to present it as a single huge file, > while I, having written a paper on it in college, see a great value in > presenting it both in a book by book format, as well as one huge file, > for a wider range of useful searches. I certainly prefer to have both > options available with Shakespeare and the Bible, and we have received > many thank you notes for the works David Widger has so kindly prepared > in a similar manner. It sounds like this is talking about cases where there's discussion about whether to post a submission as a single ebook or as multiple parts (or both). Whereas I thought we were talking about cases where someone advocates the removal of a submission (or the replacement of one with another). Do you see these as the same issue? They seem quite different to me. -Michael Dyck From shalesller at writeme.com Sun Jan 23 00:25:02 2005 From: shalesller at writeme.com (D. Starner) Date: Sun Jan 23 00:25:22 2005 Subject: !@! Re: [gutvol-d] Moving and Removing eBooks Message-ID: <20050123082502.B5F5F4BDAA@ws1-1.us4.outblaze.com> Michael Hart writes: > This is why we do not "Move or Remove" eBooks from > the more visble locations to the less visible. We do everytime we make a new edition of a book; we move the old edition from the default link to something you'd have to dig around in the directory tree does. Darn good thing, too. 
From sly at victoria.tc.ca Sun Jan 23 00:41:19 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sun Jan 23 00:41:42 2005
Subject: !@! Re: [gutvol-d] Moving and Removing eBooks
In-Reply-To: <20050123082502.B5F5F4BDAA@ws1-1.us4.outblaze.com>
References: <20050123082502.B5F5F4BDAA@ws1-1.us4.outblaze.com>
Message-ID:

On Sun, 23 Jan 2005, D. Starner wrote:

> Michael Hart writes:
> > This is why we do not "Move or Remove" eBooks from
> > the more visible locations to the less visible.
>
> We do every time we make a new edition of a book; we move the old
> edition from the default link to something you'd have to dig
> around in the directory tree for. Darn good thing, too.

This is where multiple files can make it interesting.
I just finished dealing with an email from someone pointing
out that for a certain Jules Verne title (#1842) the catalog
showed a version 11 plain text and zipped text file only.
Someone had produced an html file as well for the 10 version,
but as the catalog will generally show only the most recent
version, it was not listed. So in this case, hiding the
older file meant that people may not have been aware they
could get the same text in html as well (although,
presumably, in a less accurate reading).

Andrew

From bruce at zuhause.org Sun Jan 23 08:57:32 2005
From: bruce at zuhause.org (Bruce Albrecht)
Date: Sun Jan 23 08:57:39 2005
Subject: !@! Re: [gutvol-d] Moving and Removing eBooks
In-Reply-To:
References: <20050123082502.B5F5F4BDAA@ws1-1.us4.outblaze.com>
Message-ID: <16883.55164.572548.798869@celery.zuhause.org>

Andrew Sly writes:
> On Sun, 23 Jan 2005, D. Starner wrote:
>
> > Michael Hart writes:
> > > This is why we do not "Move or Remove" eBooks from
> > > the more visible locations to the less visible.
> > > > We do everytime we make a new edition of a book; we move the old > > edition from the default link to something you'd have to dig > > around in the directory tree does. Darn good thing, too. > > This is where multiple files can make it interesting. > I just finished dealing with an email from someone pointing > out that for a certain Jules Verne title (#1842) the catalog > showed a version 11 plain text and zipped text file only. > Someone had produced an html file as well for the 10 version, > but as the catalog will generally show only the most recent > version, it was not listed. So in this case, hading the > older file meant that people may not have been aware they > could get the same text in html as well (although, > presumably, in a less accurate reading) Or it could be an error that was present in the version 10 etext that was not present in the version 10 html. I reported a problem a few months ago in an HTML version of a book where a phrase was missing, but was in the etext version. It's been fixed, but I don't see whether it's marked as a different version. From hart at pglaf.org Sun Jan 23 08:59:21 2005 From: hart at pglaf.org (Michael Hart) Date: Sun Jan 23 08:59:22 2005 Subject: !@! Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: References: <20050123082502.B5F5F4BDAA@ws1-1.us4.outblaze.com> Message-ID: On Sun, 23 Jan 2005, Andrew Sly wrote: > > > On Sun, 23 Jan 2005, D. Starner wrote: > >> Michael Hart writes: >>> This is why we do not "Move or Remove" eBooks from >>> the more visble locations to the less visible. >> >> We do everytime we make a new edition of a book; we move the old >> edition from the default link to something you'd have to dig >> around in the directory tree does. Darn good thing, too. > > This is where multiple files can make it interesting. 
> I just finished dealing with an email from someone pointing > out that for a certain Jules Verne title (#1842) the catalog > showed a version 11 plain text and zipped text file only. > Someone had produced an html file as well for the 10 version, > but as the catalog will generally show only the most recent > version, it was not listed. So in this case, hading the > older file meant that people may not have been aware they > could get the same text in html as well (although, > presumably, in a less accurate reading) This is why the original catalog had an entry for each version, and each version was kept in the same directory as the others: the readers could just pick the ones with the highest numbers to get the ones with the most corrections, and could also see the entire history of the eBook after initial release. mh From hart at pglaf.org Sun Jan 23 10:00:14 2005 From: hart at pglaf.org (Michael Hart) Date: Sun Jan 23 10:00:15 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <41F2B6B4.FE86DFFA@ibiblio.org> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F2B6B4.FE86DFFA@ibiblio.org> Message-ID: On Sat, 22 Jan 2005, Michael Dyck wrote: > Michael Hart wrote: >> >> On Fri, 21 Jan 2005, Michael Dyck wrote: >> >>> Michael Hart wrote: >>>> >>>> there are at least a dozen or two very outspoken volunteers at Project >>>> Gutenberg among a dozen or two thousand of such volunteers, who would >>>> prefer to delete many of the original Project Gutenberg eBooks >>>> >>>> ... the removing refers to suggestions ... >>>> including eBook #100, the Complete Works of Shakespeare. >>>> >>>> ... the dozen comes from various discussions we've had over time. >>>> >>>> Some of us try to remember the past as we plan for the future. >>> >>> I'm trying to remember the past, but so far I'm not remembering it as >>> you do. 
Could you be a bit more specific, to help jog my memory, and >>> provide a basis for searching the archive? Could you name two or three >>> other ebooks whose deletion was sought/recommended/suggested? >> >> These conversations have appeared on various listservers, and over >> a long period of time, dating all the way back to the Bible #10, >> Roget's #22, the Bible #30, Sophocles #31, Jekyll and Hyde #42-43, >> The Gift of the Magi [no # at the time], Frankenstein #84 and 84a, >> and, of couse, the Complete Shakespeare #100. >> >> ... >> >> In addition, there are often such discussions immediately before during >> and after the release of various items. We are currently discussing how >> to present the Mahabharata, which as long as the Bible and Shakespeare >> combined into one book. Some want to present it as a single huge file, >> while I, having written a paper on it in college, see a great value in >> presenting it both in a book by book format, as well as one huge file, >> for a wider range of useful searches. I certainly prefer to have both >> options available with Shakespeare and the Bible, and we have received >> many thank you notes for the works David Widger has so kindly prepared >> in a similar manner. > > It sounds like this is talking about cases where there's discussion > about whether to post a submission as a single ebook or as multiple > parts (or both). Whereas I thought we were talking about cases where > someone advocates the removal of a submission (or the replacement of > one with another). Do you see these as the same issue? They seem quite > different to me. Sorry, I should have been more specific about these examples. In many case people wanted to delete files we were offering or preparing to offer, or to move all of a certain class of work into one single file, and delete the individual files. This sort of discussion happens more often than you might think. 
Here are a few more examples I came up with overnight:

The US Bill Of Rights
Peter Pan
The Night Before Christmas
Far From The Madding Crowd
President Clinton's First Inaugural Speech
The Little Prince
The 11th Britannica
The Oxford Book Of English Verse
The Voyages Of Dr. Dolittle
Siddhartha
Gone With The Wind
The Ragged Trousered Philanthropists

Not to mention JRR Tolkien.

From jmdyck at ibiblio.org Sun Jan 23 13:14:57 2005
From: jmdyck at ibiblio.org (Michael Dyck)
Date: Sun Jan 23 13:15:37 2005
Subject: !@!Re: [gutvol-d] Moving and Removing eBooks
References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F2B6B4.FE86DFFA@ibiblio.org>
Message-ID: <41F413D1.ACA56CB6@ibiblio.org>

Michael Hart wrote:
>
> On Sat, 22 Jan 2005, Michael Dyck wrote:
> >
> > It sounds like this is talking about cases where there's discussion
> > about whether to post a submission as a single ebook or as multiple
> > parts (or both). Whereas I thought we were talking about cases where
> > someone advocates the removal of a submission (or the replacement of
> > one with another). Do you see these as the same issue? They seem quite
> > different to me.
>
> Sorry, I should have been more specific about these examples.
>
> In many cases people wanted to delete files we were offering or
> preparing to offer, or to move all of a certain class of work
> into one single file, and delete the individual files. This
> sort of discussion happens more often than you might think.

Right. I understand that there can be debate about
-- whether PG has the legal right to post a work, and
-- the best way to package a (long or multi-part) work,
but neither of these sounds particularly Orwellian.

What I was talking about, and what (it seems) you were talking about
when you started this thread (about "rewriting history"), was the
removal of existing PG texts, and their replacement by different texts
(i.e., different editions, not just a repackaging of the same content).
You gave the example of ebook #100, the Complete Works of Shakespeare, whose removal was recently suggested, and analogized this to the hypothetical submission of the Britannica 11th this year, only to have it removed in a decade. So, leaving aside cases where someone says "You don't have the necessary permission to put that work online", or "This work would be better split into parts / joined into one", do you have examples (like ebook #100) where someone has advocated the removal of an existing PG text, and (optionally) its replacement by a significantly different text? -Michael Dyck From marcello at perathoner.de Mon Jan 24 11:33:24 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jan 24 11:33:30 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> Message-ID: <41F54D84.2080603@perathoner.de> Michael Hart wrote: > In more recent times such suggestions have appeared online and offline > among the member of "The Book People," "The eBook List," "The PDA-eBook > List" and I'm sure there were even more. You spoke of "PG volunteers". These sound to me as people external to PG. > In addition, there are often such discussions immediately before during > and after the release of various items. We are currently discussing how > to present the Mahabharata, which as long as the Bible and Shakespeare > combined into one book. Some want to present it as a single huge file, > while I, having written a paper on it in college, see a great value in > presenting it both in a book by book format, as well as one huge file, > for a wider range of useful searches. I don't think presenting the Mahabharata as single file qualifies as "Orwellian Rewriting of History". I also believe, the proposed deleting of PG files should be re-classified under "Arsons of Alexandria". 
--
Marcello Perathoner
Member of the 12 Outspoken

From laurent.leguillou at gmail.com Mon Jan 24 13:10:22 2005
From: laurent.leguillou at gmail.com (Laurent LE GUILLOU)
Date: Mon Jan 24 13:10:44 2005
Subject: [gutvol-d] PT1 History of PG
In-Reply-To: <20050121194924.9F8C7109922@ws6-4.us4.outblaze.com>
References: <20050121194924.9F8C7109922@ws6-4.us4.outblaze.com>
Message-ID: <41F5643E.70208@gmail.com>

Joshua Hutchinson wrote:

>Believe what?
>That Moore's Law was never meant to be applied to PG? Yes, I believe that.
>That Moore's Law is being applied arbitrarily to PG production? Yes, I believe that.
>That Moore's Law is not being adhered to by our production EXCEPT for the very specific and arbitrary start date chosen by you? Yes, I believe that.
>[...]
>
>

Like you, I believe that those references to a so-called "Moore law"
have no real meaning, and are completely useless for the project.
Announcing: "We have now 15000 books" is strong enough, there is no need
to make pseudo-scientific projections for the past and the future
according to this "Moore-law" nonsense ...

Laurent

From hart at pglaf.org Tue Jan 25 07:34:58 2005
From: hart at pglaf.org (Michael Hart)
Date: Tue Jan 25 07:35:01 2005
Subject: [gutvol-d] Addition to PG History PT1
Message-ID:

Well, it would appear those in question did not read the message
above, at least they did not reply to any of the points made, so
I am duty bound to explain in a bit more detail, as to how there
are no other choices other than Project Gutenberg's first effort
at a regular production schedule in 1991 as a starting point.

Here are various starting years and results for Moore's Law used
to project the growth of Project Gutenberg.
START     TOTAL   START     TOTAL   START     TOTAL   START     TOTAL
YEAR     NUMBER   YEAR     NUMBER   YEAR     NUMBER   YEAR     NUMBER   ACTUAL
1971  BASE 1971   1979  BASE 1979   1990  BASE 1990   1993  BASE 1993   NUMBER/YR

1971          1                                                              1 1971
1974          4                                                              4 1974
1977         16                                                              7 1977
1980         64   1979          9                                            9 1980
1983        256   1982         36                                            9 1983
1986       1024   1985        144                                           10 1986
1989       4096   1988        576   1990         10                         10 1989
1992      16384   1991       2304   1993         40   1993        100       42 1992
1995      65536   1994       9216   1996        160   1996        400      365 1995
1998     262144   1997      36864   1999        640   1999       1600     1550 1998
2001    1480576   2000     147456   2002       2560   2002      25600     4260 2001
2004    4194304   2003     589824   2005      10240   2005     102400    14944 2004
2007   16777216   2006    2359296   2008      40960   2008     409600    ????? 2007

As you can plainly see, starting Moore's Law at any other date than ~1991 is
inappropriate not only because the numbers make it obvious, but also because
1991 was the first year of regular production growth for Project Gutenberg.

From nwolcott at dsdial.net Tue Jan 25 08:23:13 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Tue Jan 25 08:23:59 2005
Subject: [gutvol-d] etext 1842 Michael Sstrogoff
Message-ID: <000a01c502fa$39d3f980$119495ce@gw98>

The catalogue currently indicates an 1842-10 html and 1842-11 txt.
Previously I had a 1842-10 txt which no longer appears. These appear to be all
the same text, although 1842-10.txt and 1842-11.txt differ in length by 2K
bytes (change in header?). The source for all was apparently Judy Boss, and as
there will be no errors in her text, unless she herself submitted it twice,
all the texts should be identical. Since the html was converted from Boss's
text there is the possibility of an error in the conversion. The html
capitalizes the 1st letter of each chapter, not in the original etext.
Otherwise a robot conversion. Too bad all the versions do not appear on the
catalogue list.

N Wolcott nwolcott2@post.harvard.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050125/55dbc618/attachment.html From joshua at hutchinson.net Tue Jan 25 08:29:27 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Tue Jan 25 08:29:32 2005 Subject: [gutvol-d] Addition to PG History PT1 Message-ID: <20050125162927.29099109970@ws6-4.us4.outblaze.com> *sigh* The only thing your chart proves is that applying Moore's Law to PG production is a waste of time. 1991 is NOT when production started. It is not when PG started. It is just the point in time you chose as the starting point to attach Moore's Law. What people are saying is this: MOORE'S LAW DOES NOT APPLY TO PG PRODUCTION. WE WOULD BE BETTER OFF JUST ANNOUNCING WE HAVE "X" NUMBER OF TEXTS INSTEAD OF COMPARING IT TO A "LAW" THAT ISN'T EVEN MEANT TO BE USED THIS WAY. Admit it, Michael, 1991 is just an arbritrary date. Just because "regular production" started in 1991 ... bull. We had regular, once a year, production before that. Josh PS BTW, if you are going to say no one "read the message above," you might want to actually quote what we supposedly didn't read. As it is, I'm assuming we all read the same thing as you, but had a slightly different level of reading comprehension, since we all seemed to get something completely different from it than you did. ----- Original Message ----- From: "Michael Hart" To: "The gutvol-d Mailing List" Subject: [gutvol-d] Addition to PG History PT1 Date: Tue, 25 Jan 2005 07:34:58 -0800 (PST) > > > Well, it would appear those in question did not read the message > above, at least they did not reply to any of the points made, so > I am duty bound to explain in a bit more detail, as to how there > are no other choices other than Project Gutenberg's first effort > at a regular production schedule in 1991 as a starting point. > > Here are various starting years and results for Moore's Law used > to project the growth of Project Gutenberg. 
>
>
> START     TOTAL   START     TOTAL   START     TOTAL   START     TOTAL
> YEAR     NUMBER   YEAR     NUMBER   YEAR     NUMBER   YEAR     NUMBER   ACTUAL
> 1971  BASE 1971   1979  BASE 1979   1990  BASE 1990   1993  BASE 1993   NUMBER/YR
>
> 1971          1                                                              1 1971
> 1974          4                                                              4 1974
> 1977         16                                                              7 1977
> 1980         64   1979          9                                            9 1980
> 1983        256   1982         36                                            9 1983
> 1986       1024   1985        144                                           10 1986
> 1989       4096   1988        576   1990         10                         10 1989
> 1992      16384   1991       2304   1993         40   1993        100       42 1992
> 1995      65536   1994       9216   1996        160   1996        400      365 1995
> 1998     262144   1997      36864   1999        640   1999       1600     1550 1998
> 2001    1480576   2000     147456   2002       2560   2002      25600     4260 2001
> 2004    4194304   2003     589824   2005      10240   2005     102400    14944 2004
> 2007   16777216   2006    2359296   2008      40960   2008     409600    ????? 2007
>
> As you can plainly see, starting Moore's Law at any other date than ~1991 is
> inappropriate not only because the numbers make it obvious, but also because
> 1991 was the first year of regular production growth for Project Gutenberg.
>
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From joshua at hutchinson.net Tue Jan 25 08:34:22 2005
From: joshua at hutchinson.net (Joshua Hutchinson)
Date: Tue Jan 25 08:34:27 2005
Subject: [gutvol-d] etext 1842 Michael Sstrogoff
Message-ID: <20050125163422.B99204F4BD@ws6-5.us4.outblaze.com>

Forgive me, but I'm not real sure what the point of this was.

strgf10.txt and strgf11.txt are both in the directory listing. However,
as with all our texts, only the latest version of each format is listed
in the catalog entry. This is completely normal.

Is this an errata report on the HTML? You mention the first letter of
each chapter.

If you could make your message a little clearer, I or someone more
knowledgeable will do our best to figure out the answer.
Josh ----- Original Message ----- From: "N Wolcott" To: "Project Gutenberg Volunteer Discussion" Subject: [gutvol-d] etext 1842 Michael Sstrogoff Date: Tue, 25 Jan 2005 11:23:13 -0500 > > The catalogue currently indicates an 1842 -10 html and 1842-11 txt. > Previously I had a 1842-10 txt which no longet appears. These appear to be all > the same text, although 1842-10.txt and t842-11.txt differ in length by 2K > bytes (change in header?). The source for all was apparently Judy Boss, and as > there will be no errors in her text, unless she herself submitted it twice, > all the texts should be identical. Since the html was converted from Boss's > text there is the possibility of an error in the conversion. The html > capitalizes the 1st letter of each chapter, not in the original etext. > Otherwise a robot conversion. Too bad all the versions do not appear on the > catalogue list. > > N Wolcott nwolcott2@post.harvard.edu > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From kouhia at nic.funet.fi Tue Jan 25 08:46:02 2005 From: kouhia at nic.funet.fi (Juhana Sadeharju) Date: Tue Jan 25 08:46:14 2005 Subject: [gutvol-d] Magic books? Message-ID: Hello. A few years ago I purchased old magic books (in public domain) from a web company. I could not find the company anymore. At the time I had money to purchase only about 10 books and wanted to purchase more later. Where I could find magic books either as commercial digitized image files or as free image files or OCR'ed text. That is, if I need to purchase, I want the original look&feel to the books (like I do have now in those 10 books). Otherwise anything goes. The books are about the real magic, performed by Houdini, James Randi, David Copperfield, Pen&Teller, Masked Magician, and many more. I have no idea where I could find original magic books so that I could digitize them myself. 
(Previously I have scanned a few math books.) Regards, Juhana From hacker at gnu-designs.com Tue Jan 25 08:54:08 2005 From: hacker at gnu-designs.com (David A. Desrosiers) Date: Tue Jan 25 08:54:36 2005 Subject: [gutvol-d] Magic books? In-Reply-To: References: Message-ID: > I have no idea where I could find original magic books so that I > could digitize them myself. (Previously I have scanned a few math > books.) Having been a magician since I was 6 years old (yes, literally), I have amassed a very large collection of printed books on Magic. I even have plenty of the older Osborne illusion plan series, as well as dozens of others (and probably 3-dozen pieces of work autographed by David Copperfield himself, from shows I've attended. I've even been on stage with Copperfield once in Hartford, CT.) Let me know what you're missing, and I'll try to see if anything I have might be able to fill in the gaps. David A. Desrosiers desrod@gnu-designs.com http://gnu-designs.com From hart at pglaf.org Tue Jan 25 09:04:37 2005 From: hart at pglaf.org (Michael Hart) Date: Tue Jan 25 09:04:39 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <41F413D1.ACA56CB6@ibiblio.org> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F2B6B4.FE86DFFA@ibiblio.org> <41F413D1.ACA56CB6@ibiblio.org> Message-ID: On Sun, 23 Jan 2005, Michael Dyck wrote: > Michael Hart wrote: >> >> On Sat, 22 Jan 2005, Michael Dyck wrote: >>> >>> It sounds like this is talking about cases where there's discussion >>> about whether to post a submission as a single ebook or as multiple >>> parts (or both). Whereas I thought we were talking about cases where >>> someone advocates the removal of a submission (or the replacement of >>> one with another). Do you see these as the same issue? They seem quite >>> different to me. >> >> Sorry, I should have been more specific about these examples. 
>> >> In many case people wanted to delete files we were offering or >> preparing to offer, or to move all of a certain class of work >> into one single file, and delete the individual files. This >> sort of discussion happens more often than you might think. > > Right. I understand that there can be debate about > -- whether PG has the legal right to post a work, and > -- the best way to package a (long or multi-part) work, > but neither of these sounds particularly Orwellian. > > What I was talking about, and what (it seems) you were talking about > when you started this thread (about "rewriting history"), was the > removal of existing PG texts, and their replacement by different texts > (i.e., different editions, not just a repackaging of the same content). > You gave the example of ebook #100, the Complete Works of Shakespeare, > whose removal was recently suggested, and analogized this to the > hypothetical submission of the Britannica 11th this year, only to have > it removed in a decade. > > So, leaving aside cases where someone says "You don't have the necessary > permission to put that work online", or "This work would be better split > into parts / joined into one", do you have examples (like ebook #100) > where someone has advocated the removal of an existing PG text, and > (optionally) its replacement by a significantly different text? The most obvious examples have been, approximately chronlogically: The Bill of Rights, to be subsumed into either The Constitution, or into the complete amendments, back when we were considering separate files for all three. I didn't want to erase the history that the entire Constitution had been deemed too large in the 70s. Frankenstein, some people really hated a particularly bad edition. I don't want to pretend that bad editions don't exist, or that they might have been the only edition available to the original volunteer. 
I am only too happy to write notes about this in the books and index, but don't want to sweep the whole idea under the carpet. There have always been those who say only their favorite edition should be posted. BTW, this is often a much larger problem with translations, which I did not include much in my list. Shakespeare and The Britannica are obviously the two most recent major examples, and another major example was Darwin. Luckily we had a Darwin expert who insisted on keeping the various editions. As for the recent request to delete Shakespeare #100, I never heard back from my reply, so I never got to the bottom line reasoning behind that request. Perhaps it was only because of copyright, or because it was such an early effort that it needed proofing to bring it up to today's standards. In either case, I don't like the idea of people suggesting we delete files for either reason. As for the Britannica, there have been such wide and varied discussions on this across listservers that there is probably too much to discuss. I haven't insisted that we continue with the Britannica, it was mostly to prove such a thing could be done when we did the first volume, and there are now other source. I would suggest we eventually use one or more of the other sources so we don't create more work for ourselves, but I would never agree that such a seminal work be deleted. People have suggested that some kind of "Reader Advisory" be included, but I don't notice them saying the same thing about Mark Twain, Darwin, or any other works. Perhaps their concern is that the eBook version might be confused with more modern editions. This could possibly be remedied with a statement in each file stating that these articles were written by experts about what they had learned 100 years ago. Personally, _I_ would LOVE to see some comparison between what was thought of as "fact" or "best theory" over periods of time, but I would not approve of sweeping those under the carpet. 
Oh, a bit more on Shakespeare, I think there was also a discussion on
whether we should include the First Folio, since it is full of original
typography and other difficulties. . .but since it was such a seminal
work, we decided to include it.

As for translations, obviously there were unauthorized editions of most
of Jules Verne, some just horrible, and I actually agreed not to use one
of those, even though I had personally typed half of it in.

There is also great, and worthwhile, concern about the Longfellow
translation of Dante. However, it is also an example that should be
preserved as an indication of history, even if we recommend the Cary
translation or any other as being of better quality.

I would tend to include both the authorized and unauthorized editions,
even, or especially, of such books as Uncle Tom's Cabin, which became
the first million-seller, perhaps ONLY because of unauthorized editions.

In most of these cases, I prefer to keep the doors open wide, perhaps
with warnings, rather than to keep them either closed, or make the
readers go through several doorways to find what we have.

Thanks!

Michael

PS

BTW, I forgot to mention at least one other example before, the NUSIRG
manual. . .we were asked to delete the plain text version. . .at the
request of the copyright holder. We considered making an issue of it,
since this was the era when the major players were trying to eliminate
plain text altogether, but, in the end, we decided that would hurt our
future relationships with donors more than it would help in the
preservation of plain text. Of course, now nearly everyone can read the
file in plain text without the help of all the work we/I did in the
conversion. [And this one was VERY difficult to convert, if you take a
look.]
From marcello at perathoner.de Mon Jan 24 12:53:53 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Tue Jan 25 09:18:10 2005
Subject: [gutvol-d] New canonical URL for etext directories
Message-ID: <41F56061.4060501@perathoner.de>

The new canonical url:

  http://www.gutenberg.org/files/12345

will redirect to:

  http://www.gutenberg.org/dirs/1/2/3/4/12345/

Of course, this works for all ebooks that are stored in the new
filesystem. Old ebooks that have not yet been moved to the new
filesystem will NOT work.

Note: the redirect takes one extra round trip to the server, so the
second url is faster. If response time is important, use the second url.

--
Marcello Perathoner
webmaster@gutenberg.org

From sly at victoria.tc.ca Tue Jan 25 09:26:02 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Tue Jan 25 09:26:08 2005
Subject: [gutvol-d] etext 1842 Michael Strogoff
In-Reply-To: <000a01c502fa$39d3f980$119495ce@gw98>
References: <000a01c502fa$39d3f980$119495ce@gw98>
Message-ID:

Hi Norm.

Until just recently, the catalog only showed two files for #1842:
strgf11.txt, and strgf11.zip.

This is a result of the catalog working the way it was intended to,
and only showing the most recent "edition" of a given text. In this
case, there was an html version of the 10 edition which was not
updated when the edition 11 text file was posted.

At the request of another volunteer, on Sunday, January 23rd, I made
a change so that the html and corresponding zip file would display in
the catalog as well.
If it helps, here is what the catalog "thinks it knows" for
files in the archive associated with PG#1842:

etext99/strgf10h.htm  1842_ 10 [ ] [HTML________________________] [iso-8859-1____] [none___] 575 KB
etext99/strgf10h.zip  1842_ 10 [ ] [HTML________________________] [iso-8859-1____] [zip____] 212 KB
etext99/strgf11.txt   1842_ 11 [ ] [Plain text__________________] [unknown_______] [none___] 552 KB
etext99/strgf11.zip   1842_ 11 [ ] [Plain text__________________] [unknown_______] [zip____] 202 KB
etext99/strgf10.txt   1842_ 10 [X] [Plain text__________________] [unknown_______] [none___] 554 KB
etext99/strgf10.zip   1842_ 10 [X] [Plain text__________________] [unknown_______] [zip____] 206 KB

On Tue, 25 Jan 2005, N Wolcott wrote:

> The catalogue currently indicates an 1842-10 html and 1842-11 txt. Previously I had a 1842-10 txt which no longer appears. These appear to be all the same text, although 1842-10.txt and 1842-11.txt differ in length by 2K bytes (change in header?). The source for all was apparently Judy Boss, and as there will be no errors in her text, unless she herself submitted it twice, all the texts should be identical. Since the html was converted from Boss's text there is the possibility of an error in the conversion. The html capitalizes the 1st letter of each chapter, not in the original etext. Otherwise a robot conversion. Too bad all the versions do not appear on the catalogue list.
> > N Wolcott nwolcott2@post.harvard.edu > From hart at pglaf.org Tue Jan 25 09:51:13 2005 From: hart at pglaf.org (Michael Hart) Date: Tue Jan 25 09:51:15 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <41F54D84.2080603@perathoner.de> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F54D84.2080603@perathoner.de> Message-ID: On Mon, 24 Jan 2005, Marcello Perathoner wrote: > Michael Hart wrote: > >> In more recent times such suggestions have appeared online and offline >> among the members of "The Book People," "The eBook List," "The PDA-eBook >> List" and I'm sure there were even more. > > You spoke of "PG volunteers". These sound to me like people external to PG. We have volunteers that converse about PG eBooks on other lists I am on, not to mention there are probably plenty of such conversations on lists I am not on. >> In addition, there are often such discussions immediately before, during, >> and after the release of various items. We are currently discussing how >> to present the Mahabharata, which is as long as the Bible and Shakespeare >> combined into one book. Some want to present it as a single huge file, >> while I, having written a paper on it in college, see a great value in >> presenting it both in a book by book format, as well as one huge file, >> for a wider range of useful searches. > > I don't think presenting the Mahabharata as a single file qualifies as > "Orwellian Rewriting of History". I don't think anyone else does, either. > I also believe the proposed deleting of PG files should be re-classified > under "Arsons of Alexandria". Cute! I love it! However, I have an even more important use for references to modern day book burning. . .re: all those copyright extensions. . . . > > > -- > Marcello Perathoner > Member of the 12 Outspoken Also very cute! Thanks!
me From gbnewby at pglaf.org Tue Jan 25 10:31:52 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Tue Jan 25 10:31:54 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F2B6B4.FE86DFFA@ibiblio.org> <41F413D1.ACA56CB6@ibiblio.org> Message-ID: <20050125183152.GA11258@pglaf.org> Just a quick addendum (though I'm not sure if it fits Michael's mold): I've received about a dozen requests to remove or expurgate our Rodwell translations of the Koran. These have all come from folks who believed his introduction to the text to be offensive. FWIW. -- Greg On Tue, Jan 25, 2005 at 09:04:37AM -0800, Michael Hart wrote: > > On Sun, 23 Jan 2005, Michael Dyck wrote: > > >Michael Hart wrote: > >> > >>On Sat, 22 Jan 2005, Michael Dyck wrote: > >>> > >>>It sounds like this is talking about cases where there's discussion > >>>about whether to post a submission as a single ebook or as multiple > >>>parts (or both). Whereas I thought we were talking about cases where > >>>someone advocates the removal of a submission (or the replacement of > >>>one with another). Do you see these as the same issue? They seem quite > >>>different to me. > >> > >>Sorry, I should have been more specific about these examples. > >> > >>In many cases people wanted to delete files we were offering or > >>preparing to offer, or to move all of a certain class of work > >>into one single file, and delete the individual files. This > >>sort of discussion happens more often than you might think. > > > >Right. I understand that there can be debate about > >-- whether PG has the legal right to post a work, and > >-- the best way to package a (long or multi-part) work, > >but neither of these sounds particularly Orwellian.
> > > >What I was talking about, and what (it seems) you were talking about > >when you started this thread (about "rewriting history"), was the > >removal of existing PG texts, and their replacement by different texts > >(i.e., different editions, not just a repackaging of the same content). > >You gave the example of ebook #100, the Complete Works of Shakespeare, > >whose removal was recently suggested, and analogized this to the > >hypothetical submission of the Britannica 11th this year, only to have > >it removed in a decade. > > > >So, leaving aside cases where someone says "You don't have the necessary > >permission to put that work online", or "This work would be better split > >into parts / joined into one", do you have examples (like ebook #100) > >where someone has advocated the removal of an existing PG text, and > >(optionally) its replacement by a significantly different text? > > The most obvious examples have been, approximately chronologically: > > The Bill of Rights, to be subsumed into either The Constitution, > or into the complete amendments, back when we were considering > separate files for all three. I didn't want to erase the history > that the entire Constitution had been deemed too large in the 70s. > > Frankenstein, some people really hated a particularly bad edition. > I don't want to pretend that bad editions don't exist, or that they > might have been the only edition available to the original volunteer. > I am only too happy to write notes about this in the books and index, > but don't want to sweep the whole idea under the carpet. There have > always been those who say only their favorite edition should be posted. > BTW, this is often a much larger problem with translations, which I > did not include much in my list. > > Shakespeare and The Britannica are obviously the two most recent > major examples, and another major example was Darwin. Luckily we > had a Darwin expert who insisted on keeping the various editions.
> > As for the recent request to delete Shakespeare #100, I never heard > back from my reply, so I never got to the bottom line reasoning > behind that request. Perhaps it was only because of copyright, > or because it was such an early effort that it needed proofing > to bring it up to today's standards. > > In either case, I don't like the idea of people suggesting we > delete files for either reason. > > As for the Britannica, there have been such wide and varied > discussions on this across listservers that there is probably > too much to discuss. I haven't insisted that we continue with > the Britannica, it was mostly to prove such a thing could be done > when we did the first volume, and there are now other sources. > I would suggest we eventually use one or more of the other sources > so we don't create more work for ourselves, but I would never agree > that such a seminal work be deleted. > > People have suggested that some kind of "Reader Advisory" be included, > but I don't notice them saying the same thing about Mark Twain, Darwin, > or any other works. Perhaps their concern is that the eBook version > might be confused with more modern editions. This could possibly be > remedied with a statement in each file stating that these articles > were written by experts about what they had learned 100 years ago. > > Personally, _I_ would LOVE to see some comparison between what was > thought of as "fact" or "best theory" over periods of time, but I > would not approve of sweeping those under the carpet. > > Oh, a bit more on Shakespeare, I think there was also a discussion > on whether we should include the First Folio, since it is full of > original typography and other difficulties. . .but since it was > such a seminal work, we decided to include it.
> > As for translations, obviously there were unauthorized editions > of most of Jules Verne, some just horrible, and I actually agreed not > to use one of those, even though I had personally typed half of it in. > > There is also great, and worthwhile, concern about the Longfellow > translation of Dante. However, it is also an example that should > be preserved as an indication of history, even if we recommend > the Cary translation or any other as being of better quality. > > I would tend to include both the authorized and unauthorized editions, > even, or especially, of such books as Uncle Tom's Cabin, which became > the first million-seller, perhaps ONLY because of unauthorized editions. > > In most of these cases, I prefer to keep the doors open wide, perhaps > with warnings, rather than to keep them either closed, or make the > readers go through several doorways to find what we have. > > Thanks! > > Michael > > > PS > > BTW, I forgot to mention at least one other example before, > the NUSIRG manual. . .we were asked to delete the plain text > version. . .at the request of the copyright holder. We considered > making an issue of it, since this was the era when the major players > were trying to eliminate plain text altogether, but, in the end, > we decided that would hurt our future relationships with donors > more than it would help in the preservation of plain text. > Of course, now nearly everyone can read the file in plain text > without the help of all the work we/I did in the conversion. > [And this one was VERY difficult to convert, if you take a look.]
> > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From aakman at csufresno.edu Tue Jan 25 10:46:01 2005 From: aakman at csufresno.edu (Alev Akman) Date: Tue Jan 25 10:46:11 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <20050125183152.GA11258@pglaf.org> References: <20050120210144.712F0EDE74@ws6-1.us4.outblaze.com> <41F162B5.AB0217D0@ibiblio.org> <41F2B6B4.FE86DFFA@ibiblio.org> <41F413D1.ACA56CB6@ibiblio.org> <20050125183152.GA11258@pglaf.org> Message-ID: <6.2.0.14.2.20050125103632.02fcdd10@zimmer.csufresno.edu> Isn't this an accepted translation of Koran? I don't see amazon.com withdrawing it from its site. There is a positive review posted for this translation which goes.. "To translate the Quran is an almost impossible task. The beauty of the Arabic language and the complexity of the text makes it very difficult to convey in english. Rodwell's translation uses language closer to that of the King James Bible to help convey the dignity of the Holy Quran. If you want to take a scholarly look at the Quran, you should read several translations. If however, you would like just one version for reference and to get a general idea of what the Quran gives to humanity, this is the book to get." As long as we have various translations of Koran, I don't see a problem with keeping the Rodwell version in PG. On the other hand, as a librarian, I have many objections to its withdrawal. Besides what I remember from my limited religious education, Koran can't be translated; it can be interpreted. In this case, it is just one person's interpretation of Koran. Alev. At 10:31 AM 1/25/2005, you wrote: >Just a quick addendum (though I'm not sure if it fits >Michael's mold): I've received about a dozen requests >to remove or expurgate our Rodwell translations of >the Koran. These have all come from folks who believed >his introduction to the text to be offensive. 
> >FWIW. > -- Greg From shalesller at writeme.com Tue Jan 25 13:41:22 2005 From: shalesller at writeme.com (D.
Starner) Date: Tue Jan 25 13:41:34 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks Message-ID: <20050125214122.59C804BE64@ws1-1.us4.outblaze.com> "Michael Hart" writes: > As for the recent request to delete Shakespeare #100, I never heard > back from my reply, so I never got to the bottom line reasoning > behind that request. Perhaps it was only because of copyright, > or because it was such an early effort that it needed proofing > to bring it up to today's standards. I sent a message to gutvol-d about it earlier. It's a copyrighted edition of a public domain text, and there's nothing in that file to indicate that the edition was first published in 1931 and isn't just an electronic copy of a public domain edition. And even if it is a 1931 edition, the copyright notices are wrong; electronic editions of texts do not get a new copyright. In many ways, it's the epitome of stuff PG opposes. > There is also great, and worthwhile, concern about the Longfellow > translation of Dante. However, it is also an example that should > be preserved as an indication of history, even if we recommend > the Cary translation or any other as being of better quality. Who cares about Dante; it's Longfellow. Different translations, even different editions are interesting as long as distinguishing marks are included in the file. From stephen.thomas at adelaide.edu.au Tue Jan 25 14:45:00 2005 From: stephen.thomas at adelaide.edu.au (Steve Thomas) Date: Tue Jan 25 14:45:14 2005 Subject: [gutvol-d] New canonical URL for etext directories In-Reply-To: <41F56061.4060501@perathoner.de> References: <41F56061.4060501@perathoner.de> Message-ID: <41F6CBEC.8020802@adelaide.edu.au> Marcello, I see that you are also redirecting old-style files, e.g.
http://www.gutenberg.org/etext90/getty11h.htm is redirected to http://www.gutenberg.org/dirs/etext90/getty11h.htm Would it not be better to redirect these to the catalog, e.g. http://www.gutenberg.org/etext90/getty* could redirect to http://www.gutenberg.org/etext/4 That would ensure that anyone using old-style links would get directed to the latest version(s) of a work. I'm guessing that you could program the redirect from the old GUTINDEX file. Steve Marcello Perathoner wrote: > The new canonical url: > > http://www.gutenberg.org/files/12345 > > will redirect to: > > http://www.gutenberg.org/dirs/1/2/3/4/12345/ > > Of course, this works for all ebooks that are stored in the new > filesystem. Old ebooks that have not yet been moved to the new > filesystem will NOT work. > > > Note: the redirect takes one extra round trip to the server, so the > second url is faster. If response time is important, use the second url. > > > > -- Stephen Thomas, Senior Systems Analyst, Adelaide University Library ADELAIDE UNIVERSITY SA 5005 AUSTRALIA Tel: +61 8 8303 5190 Fax: +61 8 8303 4369 Email: stephen.thomas@adelaide.edu.au URL: http://staff.library.adelaide.edu.au/~sthomas/ From cannona at fireantproductions.com Tue Jan 25 16:03:12 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Tue Jan 25 16:25:19 2005 Subject: [gutvol-d] anyone wanna' translate? Message-ID: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> Received this message at the cd@pglaf.org address. Anyone care to translate? >To: Project Gutenberg >Subject: Re: Your Project Gutenberg CD/DVD request > >askim! uf ben de daha gecen hafta gutenberg'de geziniyodum. bu maili >alinca anlayamadim o yuzden :) acaba yanlis bi yere mi bastim da boyle >oldu diye merak icinde hatirlamaya calisirken 'kaya'dan seda'ya' >notunu gordum! :) > -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) 
From aakman at csufresno.edu Tue Jan 25 16:32:55 2005 From: aakman at csufresno.edu (Alev Akman) Date: Tue Jan 25 16:33:10 2005 Subject: [gutvol-d] anyone wanna' translate? In-Reply-To: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> References: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> Message-ID: <6.2.0.14.2.20050125162708.03437ff0@zimmer.csufresno.edu> At 04:03 PM 1/25/2005, you wrote: >Received this message at the cd@pglaf.org address. Anyone care to translate? > >>To: Project Gutenberg >>Subject: Re: Your Project Gutenberg CD/DVD request >> >>askim! uf ben de daha gecen hafta gutenberg'de geziniyodum. bu maili >>alinca anlayamadim o yuzden :) acaba yanlis bi yere mi bastim da boyle >>oldu diye merak icinde hatirlamaya calisirken 'kaya'dan seda'ya' >>notunu gordum! :) My love! oof , I was strolling in gutenberg just last week. That's why I could not understand when I received this email. :) I was wondering if I pushed something by mistake and that's why it happened. Trying to remember with wonder, I saw your note of 'from Kaya to Seda.' The original note is in Turkish. I apologize to the sender of this note for outing her but we all need to be more careful of which buttons we push!! Alev. From kthagen at polysyllabic.com Tue Jan 25 16:37:33 2005 From: kthagen at polysyllabic.com (Karl Hagen) Date: Tue Jan 25 16:37:50 2005 Subject: [gutvol-d] anyone wanna' translate? In-Reply-To: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> References: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> Message-ID: <41F6E64D.5040701@polysyllabic.com> I can't tell you what it means, but I'm fairly sure it's Turkish. Aaron Cannon wrote: > Received this message at the cd@pglaf.org address. Anyone care to > translate? From cannona at fireantproductions.com Tue Jan 25 17:09:22 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Tue Jan 25 17:09:44 2005 Subject: [gutvol-d] anyone wanna' translate? In-Reply-To: <6.2.0.14.2.20050125162708.03437ff0@zimmer.csufresno.edu> References: <6.1.2.0.0.20050125180101.01f68a00@mail.fireantproductions.com> <6.2.0.14.2.20050125162708.03437ff0@zimmer.csufresno.edu> Message-ID: <6.1.2.0.0.20050125190757.01eb3080@mail.fireantproductions.com> Thanks for the translation. She did more than push a button. She typed in her e-mail and mailing address. Oh well. She'll get the discs anyway. Perhaps she forgot she requested them. Thanks again.
Sincerely Aaron Cannon -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.)
From traverso at dm.unipi.it Wed Jan 26 02:30:54 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Wed Jan 26 02:25:14 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <20050125214122.59C804BE64@ws1-1.us4.outblaze.com> (shalesller@writeme.com) References: <20050125214122.59C804BE64@ws1-1.us4.outblaze.com> Message-ID: <200501261030.j0QAUsS29707@posso.dm.unipi.it> >>>>> "D" == D Starner writes: D> "Michael Hart" writes: >> As for the recent request to delete Shakespeare #100, I never >> heard back from my reply, so I never got to the bottom line >> reasoning behind that request. Perhaps it was only because of >> copyright, or because it was such an early effort that it >> needed proofing to bring it up to today's standards. D> I sent a message to gutvol-d about it earlier. It's a D> copyrighted edition of a public domain text, and there's D> nothing in that file to indicate that the edition was first D> published in 1931 and isn't just an electronic copy of a public D> domain edition. And even if it is a 1931 edition, the copyright D> notices are wrong; electronic editions of texts do not get a D> new copyright. In many ways, it's the epitome of stuff PG D> opposes. Including the original publication date for example would classify this edition as public domain in Italy (where critical editions get 20 years of copyright) and in many other countries. Carlo Traverso From scottsch at ncweb.com Wed Jan 26 07:39:52 2005 From: scottsch at ncweb.com (Scott Schmucker) Date: Wed Jan 26 07:40:00 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <20050125162927.29099109970@ws6-4.us4.outblaze.com> References: <20050125162927.29099109970@ws6-4.us4.outblaze.com> Message-ID: <41F7B9C8.2010202@ncweb.com> I would agree that choosing 1991 as a starting point for a Moore's Law relation on Project Gutenberg production is arbitrary. At the same time, however, let's think about using this same logic for the semiconductor industry. 
Rather than choosing some arbitrary starting point for our measurements, let's go straight back to the beginning. The first germanium transistor was invented in 1947, and it had a minimum feature size of 0.002 inches, or about 5.08x10^-5 meters. According to Moore's law, this feature size should reduce by half roughly every 18 months, correct? By that prediction, the feature size should have since been reduced by half 38.667 times (58 years at 1.5 years per halving). That would give us a minimum feature size in today's transistors of 1.16x10^-16 meters, or 0.000001 Angstroms. Of course, the diameter of an atom is on the scale of Angstroms, meaning that under Moore's Law we should now be fitting nearly one million transistors within a single atom...needless to say we are nowhere close. Therefore, the semiconductor industry is only keeping up with Moore's Law inasmuch as Project Gutenberg is. I didn't take the time to compare other features of the first semiconductor transistor (or the first integrated circuit), as I had a bit more trouble locating other data, but I suspect the result would be the same. The point, of course, is that Moore's Law serves to indicate an exponential growth rate over whatever time frame you choose to apply it. Regardless of where it starts, it gives a pretty good idea of this. Do I think that it is particularly applicable to Project Gutenberg production? No, not really. But it is a term that the public is familiar with, and thus Michael finds that it is a useful term to use in advertising just how much work PG has been doing over the past decade (and a half). Can we really think in terms of "keeping up with Moore's Law?" Probably not, it doesn't make much sense, though the semiconductor industry does it every day, and going back to the very beginning, they are much farther away from Moore's Law than we are. - Scott Schmucker Joshua Hutchinson wrote: >*sigh* > >The only thing your chart proves is that applying Moore's Law to PG production is a waste of time.
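Scott's back-of-envelope figures above can be re-derived in a few lines. This is only a sketch of his arithmetic: the 1947 feature size and the 18-month halving period are taken from his message as given, the end year 2005 is assumed from the date of the message, and the variable names are made up for illustration.

```python
INCH_TO_M = 0.0254        # metres per inch
ANGSTROM_M = 1e-10        # metres per Angstrom

start_year = 1947                  # first transistor, per the message
end_year = 2005                    # assumed: the year the message was written
halving_period_years = 1.5         # "roughly every 18 months", per the message
feature_1947_m = 0.002 * INCH_TO_M # 0.002 inches, about 5.08e-5 m

# Number of halvings the premise predicts between 1947 and 2005.
halvings = (end_year - start_year) / halving_period_years

# Feature size the premise predicts for 2005.
predicted_m = feature_1947_m / 2 ** halvings
predicted_angstrom = predicted_m / ANGSTROM_M

if __name__ == "__main__":
    print(f"halvings:       {halvings:.3f}")
    print(f"predicted size: {predicted_m:.3g} m")
    print(f"                {predicted_angstrom:.3g} Angstrom")
```

The result lands on roughly 38.667 halvings and about 1.16x10^-16 meters, matching the figures quoted in the message.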
> >1991 is NOT when production started. It is not when PG started. It is just the point in time you chose as the starting point to attach Moore's Law. > >What people are saying is this: > >MOORE'S LAW DOES NOT APPLY TO PG PRODUCTION. WE WOULD BE BETTER OFF JUST ANNOUNCING WE HAVE "X" NUMBER OF TEXTS INSTEAD OF COMPARING IT TO A "LAW" THAT ISN'T EVEN MEANT TO BE USED THIS WAY. > >Admit it, Michael, 1991 is just an arbitrary date. Just because "regular production" started in 1991 ... bull. We had regular, once a year, production before that. > >Josh > >PS BTW, if you are going to say no one "read the message above," you might want to actually quote what we supposedly didn't read. As it is, I'm assuming we all read the same thing as you, but had a slightly different level of reading comprehension, since we all seemed to get something completely different from it than you did. > > >----- Original Message ----- >From: "Michael Hart" >To: "The gutvol-d Mailing List" >Subject: [gutvol-d] Addition to PG History PT1 >Date: Tue, 25 Jan 2005 07:34:58 -0800 (PST) > > > >>Well, it would appear those in question did not read the message >>above, at least they did not reply to any of the points made, so >>I am duty bound to explain in a bit more detail, as to how there >>are no other choices other than Project Gutenberg's first effort >>at a regular production schedule in 1991 as a starting point. >> >>Here are various starting years and results for Moore's Law used >>to project the growth of Project Gutenberg.
>>
>>
>>START  TOTAL      START  TOTAL      START  TOTAL      START  TOTAL
>>YEAR   NUMBER     YEAR   NUMBER     YEAR   NUMBER     YEAR   NUMBER     ACTUAL
>>1971   BASE 1971  1979   BASE 1979  1990   BASE 1990  1993   BASE 1993  NUMBER/YR
>>
>>1971   1                                                                1       1971
>>1974   4                                                                4       1974
>>1977   16                                                               7       1977
>>1980   64         1979   9                                              9       1980
>>1983   256        1982   36                                             9       1983
>>1986   1024       1985   144                                            10      1986
>>1989   4096       1988   576        1990   10                           10      1989
>>1992   16384      1991   2304       1993   40         1993   100        42      1992
>>1995   65536      1994   9216       1996   160        1996   400        365     1995
>>1998   262144     1997   36864      1999   640        1999   1600       1550    1998
>>2001   1480576    2000   147456     2002   2560       2002   25600      4260    2001
>>2004   4194304    2003   589824     2005   10240      2005   102400     14944   2004
>>2007   16777216   2006   2359296    2008   40960      2008   409600     ?????   2007
>>
>>As you can plainly see, starting Moore's Law at any other date than ~1991 is
>>inappropriate not only because the numbers make it obvious, but also because
>>1991 was the first year of regular production growth for Project Gutenberg.
>>
>>
>>_______________________________________________
>>gutvol-d mailing list
>>gutvol-d@lists.pglaf.org
>>http://lists.pglaf.org/listinfo.cgi/gutvol-d
>>
>>
>
>_______________________________________________
>gutvol-d mailing list
>gutvol-d@lists.pglaf.org
>http://lists.pglaf.org/listinfo.cgi/gutvol-d
>
>
>
>

From jon at noring.name Wed Jan 26 07:47:43 2005
From: jon at noring.name (Jon Noring)
Date: Wed Jan 26 07:47:50 2005
Subject: [gutvol-d] Addition to PG History PT1
In-Reply-To: <41F7B9C8.2010202@ncweb.com>
References: <20050125162927.29099109970@ws6-4.us4.outblaze.com>
	<41F7B9C8.2010202@ncweb.com>
Message-ID: <782736515.20050126084743@noring.name>

Scott Schmucker wrote:

> The point, of course, is that Moore's Law serves to indicate an
> exponential growth rate over whatever time frame you choose to apply
> it. Regardless of where it starts, it gives a pretty good idea of
> this. Do I think that it is particularly applicable to Project
> Gutenberg production? No, not really.
But it is a term that the public > is familiar with, and thus Michael finds that it is a useful term to use > in advertising just how much work PG has been doing over the past decade > (and a half). Can we really think in terms of "keeping up with Moore's > Law?" Probably not, it doesn't make much sense, though the > semiconductor industry does it every day, and going back to the very > beginning, they are much farther away from Moore's Law than we are. In the few times I tried to apply Moore's Law to something other than CPU advancement, I've used the phrase "Moore's Law-like." However, I think we should use the term "exponential" or "geometric" (whatever best applies) to describe the growth rate of PG texts. Jon From marcello at perathoner.de Wed Jan 26 10:05:35 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Wed Jan 26 11:55:28 2005 Subject: [gutvol-d] New canonical URL for etext directories In-Reply-To: <41F6CBEC.8020802@adelaide.edu.au> References: <41F56061.4060501@perathoner.de> <41F6CBEC.8020802@adelaide.edu.au> Message-ID: <41F7DBEF.1000003@perathoner.de> Steve Thomas wrote: > I see that you are also redirecting old-syle files, e.g. > > http://www.gutenberg.org/etext90/getty11h.htm > > is redirected to > > http://www.gutenberg.org/dirs/etext90/getty11h.htm > > Would it not be better to redirect these to the catalog, e.g. > > http://www.gutenberg.org/etext90/getty* > > could redirect to > > http://www.gutenberg.org/etext/4 > > That would ensure that anyone using old-style links would get directed > to the latest version(s) of a work. Something like that is already in place experimentally: If a deep link to a file is posted on a web page, I redirect to the bibrec page instead. Some people think this is a bad idea, though. 
-- Marcello Perathoner webmaster@gutenberg.org From cannona at fireantproductions.com Wed Jan 26 12:32:02 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Wed Jan 26 12:32:19 2005 Subject: [gutvol-d] PG DVDs Message-ID: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> I received this message a few days ago. I sent it to Greg for his opinion, but either my messages aren't getting through, or he's been busy, as I haven't received a response. Anyway, I thought I'd forward it to the list for discussion. Personally I don't see that there would be a problem, as long as the DVDs are given as gifts, and not sold. Nevertheless, what do the rest of you think? "Hi Project Gutenberg folks, First of all, thanks for all the wonderful work you do. I appreciate it so much. I had a question about using Project Gutenberg's CDs or DVDs as "free" gifts in a membership drive for an advocacy group. I'm doing some work for Public Knowledge (a DC based advocacy group that lobbies for the public interest in intellectual property issues-- I'm sure you've crossed paths with them), and I'm helping them improve their online outreach and fundraising. One of the things we'd like to do is let people show their support by becoming members. We'd like to offer people who donate gifts such as Public Knowledge t-shirts, or books like Lessig's "Free Culture", and we'd like for some of these items to be a celebration of Creative Commons licenses and the public domain. Along those lines, we're interested in giving members Project Gutenberg DVDs. Would that be possible? Are there any rights issues? And what would be the best way to get a hundred or so copies that we could send out? If there's a suggested donation when copies of the CD are made, what kind of donation would you consider to be fair? 
Thanks for you time, and I look forward to hearing from you, -Holmes Wilson" Sincerely Aaron Cannon -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From jlinden at projectgutenberg.ca Wed Jan 26 12:46:24 2005 From: jlinden at projectgutenberg.ca (James Linden) Date: Wed Jan 26 12:49:25 2005 Subject: [gutvol-d] PG DVDs In-Reply-To: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> Message-ID: <41F801A0.9090502@projectgutenberg.ca> I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't making any money off them per se, so I don't see a conflict of interest, etc. But heck, that's just my $0.02 CAD! -- James jlinden@projectgutenberg.ca PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send you. Aaron Cannon wrote: > I received this message a few days ago. I sent it to Greg for his > opinion, but either my messages aren't getting through, or he's been > busy, as I haven't received a response. > > Anyway, I thought I'd forward it to the list for discussion. Personally > I don't see that there would be a problem, as long as the DVDs are given > as gifts, and not sold. Nevertheless, what do the rest of you think? > > > "Hi Project Gutenberg folks, > > First of all, thanks for all the wonderful work you do. I appreciate > it so much. > > I had a question about using Project Gutenberg's CDs or DVDs as "free" > gifts in a membership drive for an advocacy group. I'm doing some > work for Public Knowledge (a DC based advocacy group that lobbies for > the public interest in intellectual property issues-- I'm sure you've > crossed paths with them), and I'm helping them improve their online > outreach and fundraising. > > One of the things we'd like to do is let people show their support by > becoming members. 
We'd like to offer people who donate gifts such as > Public Knowledge t-shirts, or books like Lessig's "Free Culture", and > we'd like for some of these items to be a celebration of Creative > Commons licenses and the public domain. Along those lines, we're > interested in giving members Project Gutenberg DVDs. > > Would that be possible? Are there any rights issues? And what would > be the best way to get a hundred or so copies that we could send out? > If there's a suggested donation when copies of the CD are made, what > kind of donation would you consider to be fair? > > Thanks for you time, and I look forward to hearing from you, > > -Holmes Wilson" From krooger at debian.org Wed Jan 26 13:01:55 2005 From: krooger at debian.org (Jonathan Walther) Date: Wed Jan 26 13:02:09 2005 Subject: [gutvol-d] how long to distribute dp project management? Message-ID: <20050126210155.GA8093@reactor-core.org> Joshua mentioned that the final stage of preparing a book, after the scanning, OCR, and proofing are done, will also be distributed. This is fantastic news. Is there any timeframe for that? I have high quality scans of some extremely RARE books, which have been out of print for more than 100 years, and would like to see them in Project Gutenberg. I plan to upload them soon. Many of the scans are from copies that are more than 300 years old, with all that entails in regard to fonts and typography. Hope PG doesn't choke on them. Jonathan -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! Eukleia: Jonathan Walther Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) Contact: 604-582-9308 (between 7am and 11pm, PST) Website: http://reactor-core.org/ Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery It's not true unless it makes you laugh, but you don't understand it until it makes you weep. 
From hart at pglaf.org Wed Jan 26 13:14:50 2005
From: hart at pglaf.org (Michael Hart)
Date: Wed Jan 26 13:14:51 2005
Subject: [gutvol-d] Addition to PG History PT1
In-Reply-To: <41F7B9C8.2010202@ncweb.com>
References: <20050125162927.29099109970@ws6-4.us4.outblaze.com>
	<41F7B9C8.2010202@ncweb.com>
Message-ID:

On Wed, 26 Jan 2005, Scott Schmucker wrote:

>
> I would agree that choosing 1991 as a starting point for a Moore's Law
> relation on Project Gutenberg production is arbitrary.

Arbitrary refers to the choice of something without any particular reason for the decision, and there are two very non-arbitrary reasons for choosing 1991 that have been mentioned quite thoroughly without being refuted by any better model.

Once other models were shown to have been chosen to reflect points from which large deviations from both our goals and from reality were predicted [i.e. proven to be false], the argument changed from picking a different year to picking no year at all.

[snipped semiconductor history]

> The point, of course, is that Moore's Law serves to indicate an exponential
> growth rate over whatever time frame you choose to apply it. Regardless of
> where it starts, it gives a pretty good idea of this. Do I think that it is
> particularly applicable to Project Gutenberg production? No, not really.
> But it is a term that the public is familiar with, and thus Michael finds
> that it is a useful term to use in advertising just how much work PG has been
> doing over the past decade (and a half). Can we really think in terms of
> "keeping up with Moore's Law?" Probably not, it doesn't make much sense,
> though the semiconductor industry does it every day, and going back to the
> very beginning, they are much farther away from Moore's Law than we are.

So. . .when it comes down to it, Moore's Law applies better in our case than in the case of semiconductors.
However, I should again point out that here it is just being used as a goal and a reference point, and has been for 15 years now. It would appeare that as long as we were always well ahead of Moore's Law for each of our projections, that no one was going to complain a lot, but now that a year has been reported that fell short, even of the previous year, much less of the previous 18 months, then the noise level increases. If there were truly an aversion to using Moore's Law for these purposes, this aversion would likely have been brought up quite often from the 1991 to the present. Of course, you are welcome to apply any techniques to your own history of Project Gutenberg, and run them up our flagpole for testing. Michael From gbnewby at pglaf.org Wed Jan 26 13:18:25 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Wed Jan 26 13:18:27 2005 Subject: [gutvol-d] PG DVDs In-Reply-To: <41F801A0.9090502@projectgutenberg.ca> References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca> Message-ID: <20050126211825.GA11866@pglaf.org> On Wed, Jan 26, 2005 at 03:46:24PM -0500, James Linden wrote: > I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't > making any money off them per se, so I don't see a conflict of interest, > etc. > > But heck, that's just my $0.02 CAD! > > -- James > jlinden@projectgutenberg.ca Sorry I missed the earlier note. Yes, I agree with James. We probably need to get some more specific language into a CD/DVD license file. The "small print" really only applies to resale where profit is involved, but we get a lot of requests that don't quite fit. It's always fine to send such requests to me and/or Michael. -- Greg > PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send you. > > Aaron Cannon wrote: > >I received this message a few days ago. 
I sent it to Greg for his > >opinion, but either my messages aren't getting through, or he's been > >busy, as I haven't received a response. > > > >Anyway, I thought I'd forward it to the list for discussion. Personally > >I don't see that there would be a problem, as long as the DVDs are given > >as gifts, and not sold. Nevertheless, what do the rest of you think? > > > > > >"Hi Project Gutenberg folks, > > > >First of all, thanks for all the wonderful work you do. I appreciate > >it so much. > > > >I had a question about using Project Gutenberg's CDs or DVDs as "free" > >gifts in a membership drive for an advocacy group. I'm doing some > >work for Public Knowledge (a DC based advocacy group that lobbies for > >the public interest in intellectual property issues-- I'm sure you've > >crossed paths with them), and I'm helping them improve their online > >outreach and fundraising. > > > >One of the things we'd like to do is let people show their support by > >becoming members. We'd like to offer people who donate gifts such as > >Public Knowledge t-shirts, or books like Lessig's "Free Culture", and > >we'd like for some of these items to be a celebration of Creative > >Commons licenses and the public domain. Along those lines, we're > >interested in giving members Project Gutenberg DVDs. > > > >Would that be possible? Are there any rights issues? And what would > >be the best way to get a hundred or so copies that we could send out? > >If there's a suggested donation when copies of the CD are made, what > >kind of donation would you consider to be fair? 
> > > >Thanks for you time, and I look forward to hearing from you, > > > >-Holmes Wilson" > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From joshua at hutchinson.net Wed Jan 26 13:22:13 2005 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Wed Jan 26 13:22:22 2005 Subject: [gutvol-d] Addition to PG History PT1 Message-ID: <20050126212213.462F3EDEA1@ws6-1.us4.outblaze.com> ----- Original Message ----- From: "Michael Hart" > > Of course, you are welcome to apply any techniques to your own history > of Project Gutenberg, and run them up our flagpole for testing. > The biggest problem I have with your posts lately, Michael, is that you "run something up the flagpole" and then when a bunch of people start trying to shoot it down, you say, "Well, you never complained before, so I'm going to ignore you now." > If there were truly an aversion to using Moore's Law for these purposes, > this aversion would likely have been brought up quite often from the 1991 > to the present. So either the PG group at large can have an opinion on your "flagpole testing" or we do everything as status-quo. Can't have it both ways. Josh From hart at pglaf.org Wed Jan 26 13:26:56 2005 From: hart at pglaf.org (Michael Hart) Date: Wed Jan 26 13:26:56 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <20050125162927.29099109970@ws6-4.us4.outblaze.com> References: <20050125162927.29099109970@ws6-4.us4.outblaze.com> Message-ID: > Joshua Hutchinson wrote: > >> *sigh* >> >> The only thing your chart proves is that applying Moore's Law to PG >> production is a waste of time. >From the apparent perspective of your comments, this entire discussion would be considered a waste of time, so I'm wondering at the point of all you have said here. >> 1991 is NOT when production started. Note the elimination of the word "regular" from "regular production." 
This sort of misquoting is hard on the reputation of the speaker. >> It is not when PG started. It is when PG first started a goal of regularly increasing production. >> It is just the point in time you chose as the >> starting point to attach Moore's Law. It is the starting point of attaching ANY kind of predictive PG goals. Moore's Law just happened to be handy, and to fit with what I though Project Gutenberg could do in the future. All in all, it's been remarkable how well starting Moore's Law from 1991 has worked. I keep asking for a better model of prediction, and having to spend my time proving how inelegantly the suggested years work out when the equations are actually moved from their elegant non-numerical form into real numbers that are obviously out of the realm of reality. >> >> What people are saying is this: >> >> MOORE'S LAW DOES NOT APPLY TO PG PRODUCTION. Then why does it fit with reality so much better than anything else? >> WE WOULD BE BETTER OFF JUST ANNOUNCING WE HAVE "X" NUMBER OF TEXTS INSTEAD >> OF COMPARING IT TO A "LAW" THAT ISN'T EVEN MEANT TO BE USED THIS WAY. There have been many technical reports filed on who should be able to use Moore's Law, in which manners it should be used, what fields it should and should not be allowed to be used in. . . but the reality of the situation is that Moore's Law has been out of the realm of the technical experts for most of its history, and you can't just stuff it back in the bottle. >> Admit it, Michael, 1991 is just an arbritrary date. Again I refer you to the opening definitions of arbitrary. >> Just because "regular production" started in 1991 ... bull. Ah, now you use the accurate quotation, and have lost your language skills. >> We had regular, once a year, production before that. If we take your statement of half truth at face value, by your count 1971 to 1990 would then yield 20 entries. Again, focusing only oh a half truth, leaving out the rest, digs you further into a hole. 
9 years of "regular, once a year, production before that" followed by 11 years in which only one number was added, includes both halves of your truth, and also indicates why 1991 is the first year from which to make projections. At least projections that are not linear, at best. * However, the real point is that all this information has been presented before, then reflected back in distortion, which I have taken time to very politely refute, time and time again, without resorting to attacking the person and not the ideas presented, and suggesting how the argument might be better made to successfully make your points. Obvious attempts have been made to make this personal, which have been ignored, and the most obvious attempt is to remove any measuring stick for our progress. As with the failed suggested models of starting 1971 and 1993 as the best baselines for Moore's Law, the idea of removing any such yardstick at all is merely a ploy to remove any objective measurement standard. Whether we meet such a standard, exceed it, or even fall short of it, it is always handy to have such a standard so we know something about where are are, where we came from, and where we are going. Michael From hart at pglaf.org Wed Jan 26 14:12:44 2005 From: hart at pglaf.org (Michael Hart) Date: Wed Jan 26 14:12:45 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <20050126212213.462F3EDEA1@ws6-1.us4.outblaze.com> References: <20050126212213.462F3EDEA1@ws6-1.us4.outblaze.com> Message-ID: On Wed, 26 Jan 2005, Joshua Hutchinson wrote: > > ----- Original Message ----- > From: "Michael Hart" >> >> Of course, you are welcome to apply any techniques to your own history >> of Project Gutenberg, and run them up our flagpole for testing. 
>> > > The biggest problem I have with your posts lately, Michael, is that you "run > something up the flagpole" and then when a bunch of people start trying to > shoot it down, you say, "Well, you never complained before, so I'm going to > ignore you now." > >> If there were truly an aversion to using Moore's Law for these purposes, >> this aversion would likely have been brought up quite often from the 1991 >> to the present. > > So either the PG group at large can have an opinion on your "flagpole > testing" or we do everything as status-quo. Can't have it both ways. > > Josh > In this case, the suggested dates were tested and found obviously wanting. Reasons were given, which were ignored and/or misquoted. After the refutation became so obvious it could not be ingored, the message was changed from changed from "let's do this idea" to "let's have no idea." As with the failed suggested models of starting 1971 and 1993 as the best baselines for Moore's Law, the ideal of removing any such yardstick at all is merely a continuing ploy to get us to remove any objective measurement standard. First it was to change a standard in use since 1991 to 1971, then it was to change it to 1993, now it is to destroy such standards altogether. If you want to live without any such standards, that is up to you, totally up to you; but you aren't going to force it on me, or get me to help you force it on others. Whether we meet such a standard, exceed it, or even fall short of such a standard, it is always best to have such a standard, so we know something about where are, where we came from, thus where we are going. Michael From shimmin at uiuc.edu Wed Jan 26 14:17:44 2005 From: shimmin at uiuc.edu (Robert Shimmin) Date: Wed Jan 26 14:17:57 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: References: <20050125162927.29099109970@ws6-4.us4.outblaze.com> Message-ID: <41F81708.8020507@uiuc.edu> Before December 1990, PG's collection size did not grow exponentially. 
Between December 1990 and August 1997, the collection size grew exponentially with a 12-month doubling time.

Between August 1997 and August 2000, the collection size grew exponentially with a 27-month doubling time.

Between August 2000 and December 2004, the collection size grew exponentially with a 20-month doubling time.

On average, between December 1990 and December 2004, the collection size grew exponentially with a 16-month doubling time.

1991 is a fine year to begin exponential fitting, but our present rate of exponential growth is somewhat less than it has been historically.

-- RS

From marcello at perathoner.de Wed Jan 26 14:19:31 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Wed Jan 26 14:32:32 2005
Subject: [gutvol-d] Addition to PG History PT1
In-Reply-To:
References: <20050125162927.29099109970@ws6-4.us4.outblaze.com>
	<41F7B9C8.2010202@ncweb.com>
Message-ID: <41F81773.1080003@perathoner.de>

Michael Hart wrote:

>> I would agree that choosing 1991 as a starting point for a Moore's Law
>> relation on Project Gutenberg production is arbitrary.
>
> Arbitrary refers to the choice of something without any particular
> reason for the decision and there are two very non-arbitrary reasons
> for choosing 1991 that have been mentioned quite thoroughly without
> being refuted by any better model.

The one reason being that any other year was proven not to work. The other reason I forgot.

I'm against "Laws" that work only on Fridays with a full moon.
-- Marcello Perathoner webmaster@gutenberg.org From cannona at fireantproductions.com Wed Jan 26 14:36:50 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Wed Jan 26 14:37:52 2005 Subject: [gutvol-d] PG DVDs In-Reply-To: <20050126211825.GA11866@pglaf.org> References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca> <20050126211825.GA11866@pglaf.org> Message-ID: <6.1.2.0.0.20050126163350.01c6fe80@mail.fireantproductions.com> I will let them know. Thanks. Also, can you confirm whether my messages are or are not getting through to you? I originally sent this E-mail to you direct, but I didn't get a response. If it's simply that you've been busy, then that's fine. I just want to make sure they're not getting lost for some reason. Thanks. Sincerely Aaron Cannon At 03:18 PM 1/26/2005, you wrote: >Sorry I missed the earlier note. Yes, I agree with James. >We probably need to get some more specific language into >a CD/DVD license file. The "small print" really only applies >to resale where profit is involved, but we get a lot >of requests that don't quite fit. It's always fine to send >such requests to me and/or Michael. > -- Greg > > > PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send > you. > > > > Aaron Cannon wrote: > > >I received this message a few days ago. I sent it to Greg for his > > >opinion, but either my messages aren't getting through, or he's been > > >busy, as I haven't received a response. > > > > > >Anyway, I thought I'd forward it to the list for discussion. Personally > > >I don't see that there would be a problem, as long as the DVDs are given > > >as gifts, and not sold. Nevertheless, what do the rest of you think? > > > > > > > > >"Hi Project Gutenberg folks, > > > > > >First of all, thanks for all the wonderful work you do. I appreciate > > >it so much. 
> > > > > >I had a question about using Project Gutenberg's CDs or DVDs as "free" > > >gifts in a membership drive for an advocacy group. I'm doing some > > >work for Public Knowledge (a DC based advocacy group that lobbies for > > >the public interest in intellectual property issues-- I'm sure you've > > >crossed paths with them), and I'm helping them improve their online > > >outreach and fundraising. > > > > > >One of the things we'd like to do is let people show their support by > > >becoming members. We'd like to offer people who donate gifts such as > > >Public Knowledge t-shirts, or books like Lessig's "Free Culture", and > > >we'd like for some of these items to be a celebration of Creative > > >Commons licenses and the public domain. Along those lines, we're > > >interested in giving members Project Gutenberg DVDs. > > > > > >Would that be possible? Are there any rights issues? And what would > > >be the best way to get a hundred or so copies that we could send out? > > >If there's a suggested donation when copies of the CD are made, what > > >kind of donation would you consider to be fair? > > > > > >Thanks for you time, and I look forward to hearing from you, > > > > > >-Holmes Wilson" > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From jmdyck at ibiblio.org Wed Jan 26 15:36:16 2005 From: jmdyck at ibiblio.org (Michael Dyck) Date: Wed Jan 26 15:36:55 2005 Subject: [gutvol-d] how long to distribute dp project management? 
References: <20050126210155.GA8093@reactor-core.org> Message-ID: <41F82970.E6AA87D6@ibiblio.org> Jonathan Walther wrote: > > Joshua mentioned that the final stage of preparing a book, after the > scanning, OCR, and proofing are done, will also be distributed. This > is fantastic news. Is there any timeframe for that? Nope. It could happen next week (though that's very unlikely), or next year, or never. Development of the DP code is done by volunteers, and not very many of them, so it's difficult to predict when (or even if) any particular advance will occur. -Michael Dyck From krooger at debian.org Wed Jan 26 15:36:59 2005 From: krooger at debian.org (Jonathan Walther) Date: Wed Jan 26 15:37:12 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <20050126212213.462F3EDEA1@ws6-1.us4.outblaze.com> References: <20050126212213.462F3EDEA1@ws6-1.us4.outblaze.com> Message-ID: <20050126233659.GA7560@reactor-core.org> On Wed, Jan 26, 2005 at 04:22:13PM -0500, Joshua Hutchinson wrote: >> If there were truly an aversion to using Moore's Law for these >> purposes, this aversion would likely have been brought up quite often >> from the 1991 to the present. > >So either the PG group at large can have an opinion on your "flagpole >testing" or we do everything as status-quo. Can't have it both ways. All debate aside, Moores Law seems like a really stupid thing to argue about. :-) Jonathan -- Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! Eukleia: Jonathan Walther Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada) Contact: 604-582-9308 (between 7am and 11pm, PST) Website: http://reactor-core.org/ Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery It's not true unless it makes you laugh, but you don't understand it until it makes you weep. 
From kthagen at polysyllabic.com Wed Jan 26 15:42:43 2005 From: kthagen at polysyllabic.com (Karl Hagen) Date: Wed Jan 26 15:42:34 2005 Subject: [gutvol-d] Orphaned Copyrights Message-ID: <41F82AF3.5040000@polysyllabic.com> (Originally sent this from the wrong e-mail address, so apologies if it appears twice.) On BoingBoing, I ran across a link to this notice of inquiry, whose importance to PG seems glaringly obvious: SUMMARY: The Copyright Office seeks to examine the issues raised by ``orphan works,'' i.e., copyrighted works whose owners are difficult or even impossible to locate. Concerns have been raised that the uncertainty surrounding ownership of such works might needlessly discourage subsequent creators and users from incorporating such works in new creative efforts or making such works available to the public. This notice requests written comments from all interested parties. Specifically, the Office is seeking comments on whether there are compelling concerns raised by orphan works that merit a legislative, regulatory or other solution, and what type of solution could effectively address these concerns without conflicting with the legitimate interests of authors and right holders. Full document at http://a257.g.akamaitech.net/7/257/2422/01jan20051800/edocket.access.gpo.gov/2005/05-1434.htm From j.hagerson at comcast.net Wed Jan 26 16:03:44 2005 From: j.hagerson at comcast.net (John Hagerson) Date: Wed Jan 26 16:04:05 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <41F81773.1080003@perathoner.de> Message-ID: <000801c50403$ae438240$6401a8c0@sarek> #caps lock ON OK. THAT'S ENOUGH ALREADY. WE GET THE POINT. LET'S GIVE THIS TOPIC A REST. NO MORE DISCUSSION OF MOORE'S LAW AND PG. PLEASE.................. 
#caps lock OFF

From jmdyck at ibiblio.org Wed Jan 26 16:28:54 2005
From: jmdyck at ibiblio.org (Michael Dyck)
Date: Wed Jan 26 16:32:57 2005
Subject: [gutvol-d] Addition to PG History PT1
References: <20050125162927.29099109970@ws6-4.us4.outblaze.com>
Message-ID: <41F835C5.11002B8E@ibiblio.org>

Michael Hart wrote:
>
> I keep asking for a better model of prediction,

If you want a 'doubling every 18 months' curve, then using a reference point of 100 in 1993 actually provides a much better fit to the (post-1990) data than does 10 in 1990.

> ... the failed suggested models of starting 1971
> and 1993 as the best baselines for Moore's Law, ...

Re the "failure" of the 1993 model, in yesterday's posting:
http://lists.pglaf.org/private.cgi/gutvol-d/2005-January/001419.html
you gave this data:
(omitting the columns that use 1971 and 1979 as reference points)

START  TOTAL      START  TOTAL
YEAR   NUMBER     YEAR   NUMBER     ACTUAL
1990   BASE 1990  1993   BASE 1993  NUMBER/YR

                                    1       1971
                                    4       1974
                                    7       1977
                                    9       1980
                                    9       1983
                                    10      1986
1990   10                           10      1989
1993   40         1993   100        42      1992
1996   160        1996   400        365     1995
1999   640        1999   1600       1550    1998
2002   2560       2002   25600      4260    2001
2005   10240      2005   102400     14944   2004
2008   40960      2008   409600     ?????   2007

However, there's a mistake in the "BASE 1993" column: the numbers after 1600 should be 6400, 25600, and 102400. Moreover, it's easier to make the comparison if we use the same years for the actual numbers as we do for the predicted numbers. With these two changes, I think the table should be something like:

        ref point =     ref point =
year    '10 in 1990'    '100 in 1993'    actual
1990    10              25               10
1993    40              100              100
1996    160             400              750
1999    640             1600             2000
2002    2560            6400             6500
2005    10240           25600            20000?
2008    40960           102400           ?????

(It should also be clarified that the number for a given year is that of Dec 10th for that year.)

As you can see, the actual numbers (after 1990) are closer to the "1993" curve than the "1990" curve.
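
The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are made up here, the "actual" totals are the approximate figures from the second table (the uncertain 2005 and 2008 entries are left out), and "closer" is measured as the average gap in doublings (log base 2), which is one reasonable choice among several.

```python
import math

def projected(year, ref_year, ref_count, doubling_months=18):
    """Predicted total for `year` on a doubling curve through (ref_year, ref_count)."""
    months = (year - ref_year) * 12
    return ref_count * 2 ** (months / doubling_months)

# Approximate actual totals (Dec 10th of each year), copied from the table above.
actual = {1990: 10, 1993: 100, 1996: 750, 1999: 2000, 2002: 6500}

def mean_log2_gap(ref_year, ref_count, years):
    """Average distance between prediction and actual, counted in doublings."""
    gaps = [abs(math.log2(projected(y, ref_year, ref_count) / actual[y]))
            for y in years]
    return sum(gaps) / len(gaps)

post_1990 = [1993, 1996, 1999, 2002]
for ref_year, ref_count in [(1990, 10), (1993, 100)]:
    curve = [round(projected(y, ref_year, ref_count)) for y in sorted(actual)]
    gap = mean_log2_gap(ref_year, ref_count, post_1990)
    print(f"{ref_count} in {ref_year}: {curve}  mean gap = {gap:.2f} doublings")
```

With an 18-month doubling time the curve grows fourfold every three years, so the two reference points reproduce the "10 in 1990" and "100 in 1993" columns above. Under this particular measure the 1993 curve sits well under one doubling away from the actual totals on average, while the 1990 curve is more than one doubling low.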
-Michael Dyck

ps: In your table, there's also a typo in the "BASE 1971" column:
1480576 should be 1048576.

From sly at victoria.tc.ca Thu Jan 27 00:06:32 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Thu Jan 27 00:06:54 2005
Subject: [gutvol-d] Orphaned Copyrights
In-Reply-To: <41F82AF3.5040000@polysyllabic.com>
References: <41F82AF3.5040000@polysyllabic.com>
Message-ID:

I've got a new nomination for "passage in a PG text most likely to be flagged by a spell checker."

From the just-released Jules Verne title: "La Jangada" (PG #14806)

_Phyjslyddqfdzxgasgzzqqehxgkfndrxujugiocytdxvksbxhhuypo
hdvyrymhuhpuydkjoxphetozsletnpmvffovpdpajxhyynojyggayme
qynfuqlnmvlyfgsuzmqiztlbqgyugsqeubvnrcredgruzblrmxyuhqhp
zdrrgcrohepqxufivvrplphonthvddqfhqsntzhhhnfepmqkyuuexktog
zgkyuumfvijdqdpzjqsykrplxhxqrymvklohhhotozvdksppsuvjhd._

My first thought on seeing that was "Someone really made a mistake here..." and I was ready to send off an email about it when my small knowledge of French kicked in, and I read the first paragraph and realized "Oh, it's supposed to be like that."

Andrew

From kouhia at nic.funet.fi Thu Jan 27 02:55:04 2005
From: kouhia at nic.funet.fi (Juhana Sadeharju)
Date: Thu Jan 27 02:55:31 2005
Subject: [gutvol-d] Magic books / Image versions
Message-ID:

>From: "David A. Desrosiers"
>
> Let me know what you're missing, and I'll try to see if
>anything I have might be able to fill in the gaps.

Hello. I found the shop again. Perhaps it was a temporary network error when I tried it a year ago. :-( The url is "http://www.lybrary.com/".

I have purchased the following set of books (collector's editions):

  The Art of Modern Conjuring    anonymous (?)
  Later Magic                    Prof. Hoffmann (1911)
  Magic                          Ellis Stanyon (1901)
  Magicians Tricks               Henry Hatton and Adrian Plate (1917)
  Modern Magic                   Prof. Hoffmann (1876)
  More Magic                     Prof. Hoffmann (1890)
  Our Magic                      Nevil Maskelyne and David Devant (1911)
  Sleight of Hand                Edwin Sachs (1885)
  The Art of Magic               T. Nelson Downs (1921)
  The Magicians Handbook         P. T. Selbit (1901)
  The Modern Conjurer            C. Lang Neil (1902)

But now the most important question: how could we get magic books into the PG collection? Lybrary is doing good work in making these books available at a relatively cheap price, and conveniently in electronic format. But "competition" never does harm. Could volunteers here check if the above books are available?

The books I have are in one-image-per-page format but I would like to have better quality images (for readability). The text versions could be produced later if people want them; I prefer the original look available in the image versions.

Image versions save much time too. For example, check "http://theses.mit.edu" for good quality image versions of theses. They would never have made them if they had chosen the same text approach as PG. We too should prefer an image version as a first version -- as the fastest version to produce.

A digital camera would be the fastest way to digitize the books, I'm sure. But how to match the people who have access to these books, the people who have a digital camera (or scanner), and the people (like me) who have time and interest to digitize?

I have a scanner if somebody would be nice enough to send me books for digitization. But with the scanner, I would have to lift up, turn, and slightly press flat the book at every page. The 1890 math books I scanned were not harmed in the process, but a scanner is not the best apparatus for digitization in this context.

Juhana
--
http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
for developers of open source graphics software

From nwolcott at dsdial.net Thu Jan 27 07:44:28 2005
From: nwolcott at dsdial.net (N Wolcott)
Date: Thu Jan 27 08:05:52 2005
Subject: [gutvol-d] etext 1842 Michael Sstrogoff
References: <20050125163422.B99204F4BD@ws6-5.us4.outblaze.com>
Message-ID: <000201c5048a$06033a80$f69495ce@gw98>

I think that both txt editions are identical.
Somehow no. 10 got added twice with different boilerplate. Then the html is a version of the same.

----- Original Message -----
From: "Joshua Hutchinson"
To: "Project Gutenberg Volunteer Discussion"
Sent: Tuesday, January 25, 2005 11:34 AM
Subject: Re: [gutvol-d] etext 1842 Michael Sstrogoff

> Forgive me, but I'm not real sure what the point of this was.
>
> strgf10.txt and strgf11.txt are both in the directory listing. However, as with all our texts, only the latest version of each format is listed in the catalog entry. This is completely normal.
>
> Is this an errata report on the HTML? You mention the first letter of each chapter.
>
> If you could make your message a little clearer, I or someone more knowledgeable will do our best to figure out the answer.
>
> Josh
>
> ----- Original Message -----
> From: "N Wolcott"
> To: "Project Gutenberg Volunteer Discussion"
> Subject: [gutvol-d] etext 1842 Michael Sstrogoff
> Date: Tue, 25 Jan 2005 11:23:13 -0500
>
> >
> > The catalogue currently indicates an 1842-10 html and 1842-11 txt.
> > Previously I had a 1842-10 txt which no longer appears. These appear to be all
> > the same text, although 1842-10.txt and 1842-11.txt differ in length by 2K
> > bytes (change in header?). The source for all was apparently Judy Boss, and as
> > there will be no errors in her text, unless she herself submitted it twice,
> > all the texts should be identical. Since the html was converted from Boss's
> > text there is the possibility of an error in the conversion. The html
> > capitalizes the 1st letter of each chapter, not in the original etext.
> > Otherwise a robot conversion. Too bad all the versions do not appear on the
> > catalogue list.
> > > > N Wolcott nwolcott2@post.harvard.edu > > > > > > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From nwolcott at dsdial.net Thu Jan 27 07:51:16 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Thu Jan 27 08:05:56 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks References: <20050125214122.59C804BE64@ws1-1.us4.outblaze.com> Message-ID: <000301c5048a$06e50f00$f69495ce@gw98> How about removing Mark Twain? ----- Original Message ----- From: "D. Starner" To: "Michael S. Hart" ; "Project Gutenberg Volunteer Discussion" Sent: Tuesday, January 25, 2005 4:41 PM Subject: Re: !@!Re: [gutvol-d] Moving and Removing eBooks > "Michael Hart" writes: > > > As for the recent request to delete Shakespeare #100, I never heard > > back from my reply, so I never got to the bottom line reasoning > > behind that request. Perhaps it was only because of copyright, > > or because it was such an early effort that it needed proofing > > to bring it up to today's standards. > > I sent a message to gutvol-d about it earlier. It's a copyrighted > edition of a public domain text, and there's nothing in that file > to indicate that the edition was first published in 1931 and isn't > just an electronic copy of a public domain edition. And even if it > is a 1931 edition, the copyright notices are wrong; electronic editions > of texts do not get a new copyright. In many ways, it's the epitome > of stuff PG opposes. > > > There is also great, and worthwhile, concern about the Longfellow > > translation of Dante. However, it is also an example that should > > be preserved as an indication of history, even if we recommend > > the Cary translation or any other as being of better quality. > > Who cares about Dante; it's Longfellow. 
Different translations, even > different editions are interesting as long as distinguishing marks > are included in the file. > -- > ___________________________________________________________ > Sign-up for Ads Free at Mail.com > http://promo.mail.com/adsfreejump.htm > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From nwolcott at dsdial.net Thu Jan 27 08:20:55 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Thu Jan 27 08:29:31 2005 Subject: [gutvol-d] how long to distribute dp project management? References: <20050126210155.GA8093@reactor-core.org> Message-ID: <00a001c5048d$5371e480$f69495ce@gw98> If you have a a valuable collection, if the scans are high quality tiff's or tiff's and jpegs you might enquire about space on ibiblio where they can be accessed as a collection. Many PG tiff's are just high enought quality to "get the job done", you might want yours to be separated from the dross. ----- Original Message ----- From: "Jonathan Walther" To: Sent: Wednesday, January 26, 2005 4:01 PM Subject: [gutvol-d] how long to distribute dp project management? > Joshua mentioned that the final stage of preparing a book, after the > scanning, OCR, and proofing are done, will also be distributed. This is > fantastic news. Is there any timeframe for that? > > I have high quality scans of some extremely RARE books, which have been > out of print for more than 100 years, and would like to see them in > Project Gutenberg. I plan to upload them soon. Many of the scans are > from copies that are more than 300 years old, with all that entails in > regard to fonts and typography. Hope PG doesn't choke on them. > > Jonathan > > -- > Puritan: Purity of faith, Purity of doctrine. Sola Scriptura! 
> Eukleia: Jonathan Walther
> Address: 12706 99 Ave, Surrey, BC V3V2P8 (Canada)
> Contact: 604-582-9308 (between 7am and 11pm, PST)
> Website: http://reactor-core.org/
>
> Patriarchy, Polygamy, Slavery === Fatherhood, Husbandry, Mastery
> Matriarchy, Monogamy, Prisons === Wickedness, Stupidity, Buggery
>
> It's not true unless it makes you laugh,
> but you don't understand it until it makes you weep.

From sly at victoria.tc.ca Thu Jan 27 10:52:32 2005
From: sly at victoria.tc.ca (Andrew Sly)
Date: Thu Jan 27 10:52:40 2005
Subject: [gutvol-d] etext 1842 Michael Sstrogoff
In-Reply-To: <000201c5048a$06033a80$f69495ce@gw98>
References: <20050125163422.B99204F4BD@ws6-5.us4.outblaze.com> <000201c5048a$06033a80$f69495ce@gw98>
Message-ID:

Well, why just speculate? Why not take a look at the texts? I've just downloaded both of them, and from a quick look here's what I found:

My difference checker choked on comparing the files. Line by line they are evidently very dissimilar.

A look at the beginnings of the files shows that edition 11 has an opening quote mark that is missing in edition 10.

Edition 11 has much of the text re-wrapped, which was a good thing I suppose, as edition 10 had lots of words broken at the end of a line (probably preserving the same line endings as in the original paper text).

So yes, they are different, and edition 11 more closely matches current PG formatting standards. (Although, personally, I would still change a few things in it.)

Andrew

On Thu, 27 Jan 2005, N Wolcott wrote:

> I think that both txt editions are identical. somehow no 10 got added twice
> with different boilerplate. Then the html is a version of the same.
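A line-by-line difference checker is bound to struggle when one edition has been re-wrapped, as described in the message above. One possible way around that, sketched here in Python, is to compare the two editions word by word instead. This is not the tool anyone in the thread used; the function names `words` and `word_diff` are invented for the illustration, the sample strings are made up rather than taken from either edition, and the hyphen handling is a crude assumption that will also join genuine compounds split across a line break.

```python
import difflib
import re


def words(text):
    """Split text into words, rejoining words hyphenated across a line break."""
    # Assumption: a hyphen directly before a newline marks a broken word.
    text = re.sub(r"-\n\s*", "", text)
    return text.split()


def word_diff(old_text, new_text):
    """Yield (tag, old_words, new_words) for each stretch where the texts differ."""
    old, new = words(old_text), words(new_text)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            yield tag, old[i1:i2], new[j1:j2]


if __name__ == "__main__":
    # Made-up sample: one edition breaks a word at the line end and
    # lacks the opening quote mark; the other is re-wrapped.
    edition_10 = 'Sire, a fresh dis-\npatch."\n"Whence?"\n'
    edition_11 = '"Sire, a fresh dispatch."\n"Whence?"\n'
    for change in word_diff(edition_10, edition_11):
        print(change)
```

Because line endings are thrown away before comparing, re-wrapping alone produces no differences, and only real changes such as a missing quote mark are reported. On a full-length book this comparison may be slow, so the two files (strgf10.txt and strgf11.txt in the directory listing mentioned earlier) might be better compared a chapter at a time.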
From hart at pglaf.org Thu Jan 27 11:09:57 2005
From: hart at pglaf.org (Michael Hart)
Date: Thu Jan 27 11:09:58 2005
Subject: !@!Re: [gutvol-d] PG DVDs
In-Reply-To: <41F801A0.9090502@projectgutenberg.ca>
References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca>
Message-ID:

One thing to remember: if the DVDs are used as "premiums" such as is done with PBS/NPR membership drives, then they are legally regarded as being "sold" and the "donation" to PBS/NPR, etc., is no longer a "donation," and thus no longer tax deductible.

Thus the "fine print" concerning the "sale" of Project Gutenberg eBooks would legally apply.

However:

When our DVDs are truly given away free of charge, as has been done by various people at public libraries, the Internet Archive, etc., that is just fine, also as per the "fine print."

I am not a lawyer, this is not legal advice, just what I remember from our lawyers and what our own local PBS/NPR stations' disclaimers say.

Michael

On Wed, 26 Jan 2005, James Linden wrote:

> I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't making
> any money off them per se, so I don't see a conflict of interest, etc.
>
> But heck, that's just my $0.02 CAD!
>
> -- James
> jlinden@projectgutenberg.ca
>
> PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send you.
>
> Aaron Cannon wrote:
>> I received this message a few days ago. I sent it to Greg for his
>> opinion, but either my messages aren't getting through, or he's been busy,
>> as I haven't received a response.
>>
>> Anyway, I thought I'd forward it to the list for discussion. Personally I
>> don't see that there would be a problem, as long as the DVDs are given as
>> gifts, and not sold. Nevertheless, what do the rest of you think?
>>
>>
>> "Hi Project Gutenberg folks,
>>
>> First of all, thanks for all the wonderful work you do. I appreciate
>> it so much.
>> >> I had a question about using Project Gutenberg's CDs or DVDs as "free" >> gifts in a membership drive for an advocacy group. I'm doing some >> work for Public Knowledge (a DC based advocacy group that lobbies for >> the public interest in intellectual property issues-- I'm sure you've >> crossed paths with them), and I'm helping them improve their online >> outreach and fundraising. >> >> One of the things we'd like to do is let people show their support by >> becoming members. We'd like to offer people who donate gifts such as >> Public Knowledge t-shirts, or books like Lessig's "Free Culture", and >> we'd like for some of these items to be a celebration of Creative >> Commons licenses and the public domain. Along those lines, we're >> interested in giving members Project Gutenberg DVDs. >> >> Would that be possible? Are there any rights issues? And what would >> be the best way to get a hundred or so copies that we could send out? >> If there's a suggested donation when copies of the CD are made, what >> kind of donation would you consider to be fair? >> >> Thanks for you time, and I look forward to hearing from you, >> >> -Holmes Wilson" > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From cannona at fireantproductions.com Thu Jan 27 11:47:21 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Jan 27 11:47:28 2005 Subject: [gutvol-d] some feedback Message-ID: <6.1.2.0.0.20050127130831.01b09cc0@mail.fireantproductions.com> Received the following message and I thought the list might be interested. 
>Date: Thu, 27 Jan 2005 14:18:03 +0100
>From: "Barbara Reed"
>To: cannona@fireantproductions.com
>Subject: Fwd: Project Gutenberg CD/DVD
>Reply-To: barbara@interpc.fr
>User-Agent: Opera M2/7.54 (Win32, build 3865)
>
>
>
>------- Forwarded message -------
>From: "David Kettlewell"
>To: barbara@interpc.fr
>Subject: Project Gutenberg CD/DVD
>Date: Wed, 26 Jan 2005 22:09:04 +0100
>
>Dear Barbara,
>
>>This message is just a small note to notify you that it
>>has been mailed.
>
>And this is one to let you know they already arrived - many thanks!
>
>>We have also included an extra disc for you to give
>>away to a friend, family member, library, school, ETC.
>
>There was just one of each, but that's fine anyway, I can make further
>copies: but what about this delightful label with Alice drawing aside
>the curtain to reveal the door to the secrets of world literature - is
>there artwork for that?! My labels don't usually come out as nicely as
>that, are they done in some special way, or do you just do them at home
>like the rest of us?
>
>>To hear about new Project Gutenberg eBooks, or get involved in
>>creating new eBooks, visit
>>http://www.gutenberg.org
>
>Yes, I certainly see myself getting involved in various ways in the future,
>it seems such a wonderful goal and the friendly democratic profile
>seems too good to be true!
>
>I didn't find it easy to find out how to send in suggestions - I
>suppose people aren't expecting us to send paper letters to Salt Lake
>City? For instance, I wonder if you can tell me where to send thoughts
>like the following?
- or does one send everything to info@pglaf.org?:
>
>- the letter that came with these discs should really have a sentence
>for those who *don't* already have an unzipper on their computers: I
>know more than one Windows user who doesn't have WinZip or a clue
>about how to get it or install it, and of course Mac users may not know
>that these days StuffIt Expander also handles zipped files, since
>earlier versions didn't and zip isn't a usual format for the Mac.
>Another detail, if tiny, is that 'Zip' isn't a program, but a format, so
>rather than "compressed using the Zip program" a more usual formulation
>would perhaps be "compressed in Zip format" (of course they were
>compressed with *a* Zip program, but there are many, not just one, and
>the way it was done isn't interesting, just the result because that's
>what you have to have software to deal with!)
>
>- there's an enormous difference in legibility between the two labels,
>the DVD label would gain greatly by having the same kind of
>solid-colour bar as the CD one has
>
>- I'm confused and alarmed by the conflict between
>a) the simple happy injunction in the letter and the web-pages to share
>the files or copy the disc; and
>b) the standard 'small print' in every Gutenberg file which says
>ominously "Be sure to check the copyright laws for your country before
>downloading or redistributing this or any other Project Gutenberg
>file"; and the even more ominous "You agree that if you distribute this
>etext or a copy of it to anyone, you will indemnify and hold the
>Project, its officers, members and agents harmless from all liability,
>cost and expense, including legal fees, that arise by reason of your
>distribution"
>
>- I didn't find anything about how the copying of CDs and DVDs works, for
>anyone who might be able to contribute in this way: the reference in
>the FAQs points just to the consumer's side of things
>
>- the index page of the CD says "The eBooks on the CD consist mainly of
>text and HTML files, with a few
movie files": I could only find one
>movie file, landing.avi, and although it's well worth having, I don't
>see how anyone could call it an eBook
>
>- it would be wonderful if there were a way of getting the audio eBooks
>in disc collections, for those of us who don't have broad-band it's a
>major project to download all those separate files ...
>
>- one wonders what the attitude of Project Gutenberg is to different
>editions, and indeed to history in general! For example, it seems
>extremely odd that there is only one version of Aesop's fables, which
>contains a very strongly-worded criticism of other people's work, with
>no mention of when it was made - the latest date I could find was 1864,
>so it's evidently after that, but doesn't seem to be a modern one.
>Similarly with Chaucer's Canterbury Tales, although there is a plethora
>of notes I saw nothing to show when the edition was made, which is an
>absolutely central feature of an intelligent reading of anything ...
>
>:::::::::::::::::::::::::
>
>I wonder which part of France you're in ... I've had very positive
>experiences of Auvergne, Vendée, Bretagne, and have worked in places
>like Calais and Arras, as well as Paris and Marseilles ...
>
>All best wishes,
>
>David Kettlewell
>formerly professor, Tartu University, Estonia
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>Musica Humana -
>Music and Musicology
>to educate the whole person
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
>
>--
>With best wishes
> From the lady of the changeable locks, soggy shazadi, duchess of darkness
> and damp, Barbara Reed, and all the furry friends. Dalie asks you to
> sponsor her in aid of the Brooke Hospital, you can visit at
> http://www.thebrooke.org
>She is walking from Nantes to Brest in 2005 and wants your money to help
>working animals in poor countries.

--
E-mail: cannona@fireantproductions.com
Skype: cannona
MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.)
From scottsch at ncweb.com Thu Jan 27 11:49:55 2005 From: scottsch at ncweb.com (Scott Schmucker) Date: Thu Jan 27 11:50:02 2005 Subject: !@!Re: [gutvol-d] PG DVDs In-Reply-To: References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca> Message-ID: <41F945E3.9030509@ncweb.com> I am neither an accountant nor a lawyer, but my layman's understanding of this was that a donation of this sort would still be tax deductible, less the value of the goods that you received in return. For example, if I donated $100 to an organization and they gave me a stuffed animal valued at $5 in return, then I would be able to deduct $95 of this $100 donation. Is my understanding of this inaccurate? Scott Michael Hart wrote: > > One thing to remember, if the DVD are used as "premiums" such as is done > with PBS/NPR membership drives, then they are legally regarded as being > "sold" and the "donation" to PBS/NPR, etc., is no longer a "donation," > and thus no longer tax deductible. > > Thus the "fine print" concerning the "sale" of Project Gutenberg eBooks > would legally apply. > > However: > > When our DVDs are truly given away free of charge, as has been done > by various people at public libraries, the Internet Archive, etc., > that is just fine, also as per the "fine print." > > I am not a lawyer, this is not legal advice, > just what I remember from our lawyers and what > our own local PBS/NPR stations' disclaimers. > > Michael > > > On Wed, 26 Jan 2005, James Linden wrote: > >> I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't >> making any money off them per se, so I don't see a conflict of >> interest, etc. >> >> But heck, that's just my $0.02 CAD! >> >> -- James >> jlinden@projectgutenberg.ca >> >> PS: Aaron, email your snail mail addy, I have a stack of PG CDs to >> send you. >> >> Aaron Cannon wrote: >> >>> I received this message a few days ago. 
I sent it to Greg for his >>> opinion, but either my messages aren't getting through, or he's been >>> busy, as I haven't received a response. >>> >>> Anyway, I thought I'd forward it to the list for discussion. >>> Personally I don't see that there would be a problem, as long as the >>> DVDs are given as gifts, and not sold. Nevertheless, what do the >>> rest of you think? >>> >>> >>> "Hi Project Gutenberg folks, >>> >>> First of all, thanks for all the wonderful work you do. I appreciate >>> it so much. >>> >>> I had a question about using Project Gutenberg's CDs or DVDs as "free" >>> gifts in a membership drive for an advocacy group. I'm doing some >>> work for Public Knowledge (a DC based advocacy group that lobbies for >>> the public interest in intellectual property issues-- I'm sure you've >>> crossed paths with them), and I'm helping them improve their online >>> outreach and fundraising. >>> >>> One of the things we'd like to do is let people show their support by >>> becoming members. We'd like to offer people who donate gifts such as >>> Public Knowledge t-shirts, or books like Lessig's "Free Culture", and >>> we'd like for some of these items to be a celebration of Creative >>> Commons licenses and the public domain. Along those lines, we're >>> interested in giving members Project Gutenberg DVDs. >>> >>> Would that be possible? Are there any rights issues? And what would >>> be the best way to get a hundred or so copies that we could send out? >>> If there's a suggested donation when copies of the CD are made, what >>> kind of donation would you consider to be fair? 
>>> >>> Thanks for you time, and I look forward to hearing from you, >>> >>> -Holmes Wilson" >> >> _______________________________________________ >> gutvol-d mailing list >> gutvol-d@lists.pglaf.org >> http://lists.pglaf.org/listinfo.cgi/gutvol-d >> > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > From nwolcott at dsdial.net Thu Jan 27 12:01:18 2005 From: nwolcott at dsdial.net (N Wolcott) Date: Thu Jan 27 12:02:10 2005 Subject: [gutvol-d] million book project Message-ID: <002801c504ab$069d0680$f69495ce@gw98> Here is Chapter I of the Steam House by Jules Verne. I think they have a way to go. I wonder if Google is planning to do better? To be fair the Dejavu look at the rather bad images allows them to be mostly made out, so one could reconstruct most of this page. The first line is "TWO THOUSAND POUNDS FOR A HEAD". The images are like with a 300 mp camera, so not surprizing the poor OCR. I think PG texts are easier to read. I understand the books are being destroyed after scanning: might as well put them right into the dumpster and save all the trouble. CHAPTER L ".nvc,it, The name of the Governor N Wolcott nwolcott2@post.harvard.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20050127/2fae23ba/attachment.html From hart at pglaf.org Thu Jan 27 12:40:59 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 27 12:41:00 2005 Subject: [gutvol-d] Addition to PG History PT1 In-Reply-To: <41F835C5.11002B8E@ibiblio.org> References: <20050125162927.29099109970@ws6-4.us4.outblaze.com> <41F835C5.11002B8E@ibiblio.org> Message-ID: Fixed the typos, thanks so much!!! I think I must have gotten a double click when running the "Base 1993". . .thanks for catching that. . .been working too hard lately, or [hopefully] I would see those errors myself. 
Will work on other year models as well. More thanks!!! Michael On Wed, 26 Jan 2005, Michael Dyck wrote: > Michael Hart wrote: >> >> I keep asking for a better model of prediction, > > If you want a 'doubling every 18 months' curve, then using a > reference point of 100 in 1993 actually provides a much better > fit to the (post-1990) data than does 10 in 1990. > >> ... the failed suggested models of starting 1971 >> and 1993 as the best baselines for Moore's Law, ... > > Re the "failure" of the 1993 model, in yesterday's posting: > http://lists.pglaf.org/private.cgi/gutvol-d/2005-January/001419.html > you gave this data: (omitting the columns that use 1971 and 1979 > as reference points) > > START TOTAL START TOTAL > YEAR NUMBER YEAR NUMBER ACTUAL > 1990 BASE 1990 1993 BASE 1993 NUMBER/YR > 1 1971 > 4 1974 > 7 1977 > 9 1980 > 9 1983 > 10 1986 > 1990 10 10 1989 > 1993 40 1993 100 42 1992 > 1996 160 1996 400 365 1995 > 1999 640 1999 1600 1550 1998 > 2002 2560 2002 25600 4260 2001 > 2005 10240 2005 102400 14944 2004 > 2008 40960 2008 409600 ????? 2007 > > However, there's a mistake in the "BASE 1993" column: the numbers after > 1600 should be 6400, 25600, and 102400. Moreover, it's easier to make > the comparison if we use the same years for the actual numbers as we do > for the predicted numbers. With these two changes, I think the table > should be something like: > > ref point = ref point = > year '10 in 1990' '100 in 1993' actual > > 1990 10 25 10 > 1993 40 100 100 > 1996 160 400 750 > 1999 640 1600 2000 > 2002 2560 6400 6500 > 2005 10240 25600 20000? > 2008 40960 102400 ????? > > (It should also be clarified that the number for a given year is that of > Dec 10th for that year.) > > As you can see, the actual numbers (after 1990) are closer to the "1993" > curve than the "1990" curve. > > -Michael Dyck > > ps: In your table, there's also a typo in the "BASE 1971" column: > 1480576 should be 1048576. 
> > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From jon at noring.name Thu Jan 27 12:38:40 2005 From: jon at noring.name (Jon Noring) Date: Thu Jan 27 12:42:01 2005 Subject: [gutvol-d] Re: some feedback to gutvol-d In-Reply-To: <6.1.2.0.0.20050127130831.01b09cc0@mail.fireantproductions.com> References: <6.1.2.0.0.20050127130831.01b09cc0@mail.fireantproductions.com> Message-ID: <12819626250.20050127133840@noring.name> Aaron posted a letter from Barbara Reed to the gutvol-d list: She wrote, in part: > - one wonders what the attitude of Project Gutenberg is to different > editions, and indeed to history in general! For example, it seems > extremely odd that there is only one version of Aesop's fables, which > contains a very strongly-worded criticism of other peoples' work, with > no mention of when it was made - the latest date I could find was 1864, > so it's evidently after that, but doesn't seem to be a modern one. > Similarly with Chaucer's Canterbury Tales, although there is a plethora > of notes I saw nothing to show when the edition was made, which is an > absolutely central feature of an intelligent reading of anything ... Obviously, this shows again the *importance* that the source(s) for each and every PG text be mentioned. Any PG text which does not include this information is inherently broken. Hopefully, when Distributed Proofreaders begins the process of redoing the earliest PG works, this problem will finally be fixed. Jon Noring (p.s., as I've noted before, any PG text which "ASCII-ized" any of the characters in the original source, such as accented characters, is also broken -- these texts should also be repaired or replaced.) 
From hart at pglaf.org Thu Jan 27 12:51:29 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 27 12:51:30 2005 Subject: [gutvol-d] million book project In-Reply-To: <002801c504ab$069d0680$f69495ce@gw98> References: <002801c504ab$069d0680$f69495ce@gw98> Message-ID: Yes, I have seen both good and bad examples, some very good, some very bad. . . . Michael On Thu, 27 Jan 2005, N Wolcott wrote: > Here is Chapter I of the Steam House by Jules Verne. I think they have a way to go. I wonder if Google is planning to do better? To be fair the Dejavu look at the rather bad images allows them to be mostly made out, so one could reconstruct most of this page. The first line is "TWO THOUSAND POUNDS FOR A HEAD". The images are like with a 300 mp camera, so not surprizing the poor OCR. I think PG texts are easier to read. I understand the books are being destroyed after scanning: might as well put them right into the dumpster and save all the trouble. > > > CHAPTER L > ".nv > " A tst'.WAfct* of two thousand pounds will he paid to any > Mjtr who will drhVrr up, dead or alive, oik* of the prime > movrr? ifMdrney, the Nabob Datulou Taut, nnnuionly > > Jitu'h vv?t'i the ijotirr rratl hy the iiilialntants tif Aurun* > > jUitMul, mi tltr rvniiiijj uf the* 6th of March, 1807, > > A cojy iff the jil,u,;in! h?u! been recently affixed to the > wall iif ?i loiirly and ntiiuu! 
hunj;ulow on the hanks of > llir iJitiuUttiu, *iud already the conu*r of the paper htvir- > iiijj flu* j;r?'on*l fi*iiiir*-?a fiaiiir cx<,%nited hy tiuinCj .secretly > Utltuii'rt) hy uthnrji'**"WiH jjone, > > The fiiiiti^ had Urea there^ printal in Inrjjo letters, hut > If wafi torn off hy the hand of a ^solitary fakir who > IKij^ed t?y that Ucst^latc *sj>c,it, The name of the Governor > > N Wolcott nwolcott2@post.harvard.edu > From hart at pglaf.org Thu Jan 27 12:55:04 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 27 12:55:05 2005 Subject: !@!Re: [gutvol-d] PG DVDs In-Reply-To: <41F945E3.9030509@ncweb.com> References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca> <41F945E3.9030509@ncweb.com> Message-ID: On Thu, 27 Jan 2005, Scott Schmucker wrote: > I am neither an accountant nor a lawyer, but my layman's understanding of > this was that a donation of this sort would still be tax deductible, less the > value of the goods that you received in return. For example, if I donated > $100 to an organization and they gave me a stuffed animal valued at $5 in > return, then I would be able to deduct $95 of this $100 donation. Is my > understanding of this inaccurate? I'm no authority, so I have a call in to our local PBS station to find out. mh > > Scott > > Michael Hart wrote: > >> >> One thing to remember, if the DVD are used as "premiums" such as is done >> with PBS/NPR membership drives, then they are legally regarded as being >> "sold" and the "donation" to PBS/NPR, etc., is no longer a "donation," >> and thus no longer tax deductible. >> >> Thus the "fine print" concerning the "sale" of Project Gutenberg eBooks >> would legally apply. >> >> However: >> >> When our DVDs are truly given away free of charge, as has been done >> by various people at public libraries, the Internet Archive, etc., >> that is just fine, also as per the "fine print." 
>> >> I am not a lawyer, this is not legal advice, >> just what I remember from our lawyers and what >> our own local PBS/NPR stations' disclaimers. >> >> Michael >> >> >> On Wed, 26 Jan 2005, James Linden wrote: >> >>> I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't >>> making any money off them per se, so I don't see a conflict of interest, >>> etc. >>> >>> But heck, that's just my $0.02 CAD! >>> >>> -- James >>> jlinden@projectgutenberg.ca >>> >>> PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send >>> you. >>> >>> Aaron Cannon wrote: >>> >>>> I received this message a few days ago. I sent it to Greg for his >>>> opinion, but either my messages aren't getting through, or he's been >>>> busy, as I haven't received a response. >>>> >>>> Anyway, I thought I'd forward it to the list for discussion. >>>> Personally I don't see that there would be a problem, as long as the >>>> DVDs are given as gifts, and not sold. Nevertheless, what do the rest >>>> of you think? >>>> >>>> >>>> "Hi Project Gutenberg folks, >>>> >>>> First of all, thanks for all the wonderful work you do. I appreciate >>>> it so much. >>>> >>>> I had a question about using Project Gutenberg's CDs or DVDs as "free" >>>> gifts in a membership drive for an advocacy group. I'm doing some >>>> work for Public Knowledge (a DC based advocacy group that lobbies for >>>> the public interest in intellectual property issues-- I'm sure you've >>>> crossed paths with them), and I'm helping them improve their online >>>> outreach and fundraising. >>>> >>>> One of the things we'd like to do is let people show their support by >>>> becoming members. We'd like to offer people who donate gifts such as >>>> Public Knowledge t-shirts, or books like Lessig's "Free Culture", and >>>> we'd like for some of these items to be a celebration of Creative >>>> Commons licenses and the public domain. Along those lines, we're >>>> interested in giving members Project Gutenberg DVDs. 
>>>> >>>> Would that be possible? Are there any rights issues? And what would >>>> be the best way to get a hundred or so copies that we could send out? >>>> If there's a suggested donation when copies of the CD are made, what >>>> kind of donation would you consider to be fair? >>>> >>>> Thanks for you time, and I look forward to hearing from you, >>>> >>>> -Holmes Wilson" >>> >>> _______________________________________________ >>> gutvol-d mailing list >>> gutvol-d@lists.pglaf.org >>> http://lists.pglaf.org/listinfo.cgi/gutvol-d >>> >> _______________________________________________ >> gutvol-d mailing list >> gutvol-d@lists.pglaf.org >> http://lists.pglaf.org/listinfo.cgi/gutvol-d >> >> > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From hart at pglaf.org Thu Jan 27 13:03:38 2005 From: hart at pglaf.org (Michael Hart) Date: Thu Jan 27 13:03:39 2005 Subject: !@!Re: [gutvol-d] Moving and Removing eBooks In-Reply-To: <000301c5048a$06e50f00$f69495ce@gw98> References: <20050125214122.59C804BE64@ws1-1.us4.outblaze.com> <000301c5048a$06e50f00$f69495ce@gw98> Message-ID: On Thu, 27 Jan 2005, N Wolcott wrote: > How about removing Mark Twain? Actually, Mark Twain is possibly the most removed author in America. mh > ----- Original Message ----- > From: "D. Starner" > To: "Michael S. Hart" ; "Project Gutenberg Volunteer > Discussion" > Sent: Tuesday, January 25, 2005 4:41 PM > Subject: Re: !@!Re: [gutvol-d] Moving and Removing eBooks > > >> "Michael Hart" writes: >> >>> As for the recent request to delete Shakespeare #100, I never heard >>> back from my reply, so I never got to the bottom line reasoning >>> behind that request. Perhaps it was only because of copyright, >>> or because it was such an early effort that it needed proofing >>> to bring it up to today's standards. >> >> I sent a message to gutvol-d about it earlier. 
It's a copyrighted >> edition of a public domain text, and there's nothing in that file >> to indicate that the edition was first published in 1931 and isn't >> just an electronic copy of a public domain edition. And even if it >> is a 1931 edition, the copyright notices are wrong; electronic editions >> of texts do not get a new copyright. In many ways, it's the epitome >> of stuff PG opposes. >> >>> There is also great, and worthwhile, concern about the Longfellow >>> translation of Dante. However, it is also an example that should >>> be preserved as an indication of history, even if we recommend >>> the Cary translation or any other as being of better quality. >> >> Who cares about Dante; it's Longfellow. Different translations, even >> different editions are interesting as long as distinguishing marks >> are included in the file. >> -- >> ___________________________________________________________ >> Sign-up for Ads Free at Mail.com >> http://promo.mail.com/adsfreejump.htm >> >> _______________________________________________ >> gutvol-d mailing list >> gutvol-d@lists.pglaf.org >> http://lists.pglaf.org/listinfo.cgi/gutvol-d >> > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From cannona at fireantproductions.com Thu Jan 27 13:24:29 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Jan 27 13:24:46 2005 Subject: !@!Re: [gutvol-d] PG DVDs In-Reply-To: <41F945E3.9030509@ncweb.com> References: <6.1.2.0.0.20050126142912.01afebf8@mail.fireantproductions.com> <41F801A0.9090502@projectgutenberg.ca> <41F945E3.9030509@ncweb.com> Message-ID: <6.1.2.0.0.20050127151914.01c3a448@mail.fireantproductions.com> This begs the question, what value does the DVD have, if any? After all, the DVD is available for free on the internet. 
For example, if I give someone a piece of paper after they give my organization a donation, they don't necessarily have to deduct the value of the piece of paper because the monetary value of the item received is so small to make it moot. The same could be argued about the DVDs. That's my interpretation. Sincerely Aaron Cannon At 01:49 PM 1/27/2005, you wrote: >I am neither an accountant nor a lawyer, but my layman's understanding of >this was that a donation of this sort would still be tax deductible, less >the value of the goods that you received in return. For example, if I >donated $100 to an organization and they gave me a stuffed animal valued >at $5 in return, then I would be able to deduct $95 of this $100 >donation. Is my understanding of this inaccurate? > >Scott > >Michael Hart wrote: > >> >>One thing to remember, if the DVD are used as "premiums" such as is done >>with PBS/NPR membership drives, then they are legally regarded as being >>"sold" and the "donation" to PBS/NPR, etc., is no longer a "donation," >>and thus no longer tax deductible. >> >>Thus the "fine print" concerning the "sale" of Project Gutenberg eBooks >>would legally apply. >> >>However: >> >>When our DVDs are truly given away free of charge, as has been done >>by various people at public libraries, the Internet Archive, etc., >>that is just fine, also as per the "fine print." >> >>I am not a lawyer, this is not legal advice, >>just what I remember from our lawyers and what >>our own local PBS/NPR stations' disclaimers. >> >>Michael >> >> >>On Wed, 26 Jan 2005, James Linden wrote: >> >>> I say "Let 'em at 'em!" (That's Texas speak for "yes"). They aren't >>> making any money off them per se, so I don't see a conflict of interest, etc. >>> >>>But heck, that's just my $0.02 CAD! >>> >>>-- James >>>jlinden@projectgutenberg.ca >>> >>>PS: Aaron, email your snail mail addy, I have a stack of PG CDs to send you. >>> >>>Aaron Cannon wrote: >>> >>>>I received this message a few days ago. 
I sent it to Greg for his >>>>opinion, but either my messages aren't getting through, or he's been >>>>busy, as I haven't received a response. >>>> >>>>Anyway, I thought I'd forward it to the list for discussion. >>>>Personally I don't see that there would be a problem, as long as the >>>>DVDs are given as gifts, and not sold. Nevertheless, what do the rest >>>>of you think? >>>> >>>> >>>>"Hi Project Gutenberg folks, >>>> >>>>First of all, thanks for all the wonderful work you do. I appreciate >>>>it so much. >>>> >>>>I had a question about using Project Gutenberg's CDs or DVDs as "free" >>>>gifts in a membership drive for an advocacy group. I'm doing some >>>>work for Public Knowledge (a DC based advocacy group that lobbies for >>>>the public interest in intellectual property issues-- I'm sure you've >>>>crossed paths with them), and I'm helping them improve their online >>>>outreach and fundraising. >>>> >>>>One of the things we'd like to do is let people show their support by >>>>becoming members. We'd like to offer people who donate gifts such as >>>>Public Knowledge t-shirts, or books like Lessig's "Free Culture", and >>>>we'd like for some of these items to be a celebration of Creative >>>>Commons licenses and the public domain. Along those lines, we're >>>>interested in giving members Project Gutenberg DVDs. >>>> >>>>Would that be possible? Are there any rights issues? And what would >>>>be the best way to get a hundred or so copies that we could send out? >>>>If there's a suggested donation when copies of the CD are made, what >>>>kind of donation would you consider to be fair? 
>>>> >>>>Thanks for you time, and I look forward to hearing from you, >>>> >>>>-Holmes Wilson" >>> >>>_______________________________________________ >>>gutvol-d mailing list >>>gutvol-d@lists.pglaf.org >>>http://lists.pglaf.org/listinfo.cgi/gutvol-d >>_______________________________________________ >>gutvol-d mailing list >>gutvol-d@lists.pglaf.org >>http://lists.pglaf.org/listinfo.cgi/gutvol-d >> > >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From vze3rknp at verizon.net Thu Jan 27 14:00:11 2005 From: vze3rknp at verizon.net (Juliet Sutherland) Date: Thu Jan 27 14:00:17 2005 Subject: [gutvol-d] million book project In-Reply-To: <002801c504ab$069d0680$f69495ce@gw98> References: <002801c504ab$069d0680$f69495ce@gw98> Message-ID: <41F9646B.80401@verizon.net> What the Million Book Project does is fundamentally different from what PG does. MBP is in the business of providing scans of actual pages. In order to facilitate searching, they run OCR to obtain text which is used by the search engine, the general idea being that any word that's important will turn up enough times in a book to have been OCR'd correctly at least once. The djvu system allows them to take the user to the correct page, and highlight the word(s) found. Virtually all of the image archives do something similar to this. The key words appear to be "full text search". What MBP does NOT do is provide corrected OCR so that the book is readable in plain text. Unlike most of the other archives, which bury the OCR'd text so that only the search engine can see it, MBP allows access to the uncorrected OCR. Examples like the one you give show exactly what DP does. Turn nonsense into real text. 
MBP is very friendly and encourages DP and PG to make as much use of their scans as possible. Having real text versions of their books is only to their advantage and that of their users. Virtually none of the other image archives provide corrected text. It is simply cost-prohibitive to do so. What DP does as a volunteer effort would be extremely costly to replicate. For this reason alone, I believe that Google will be using raw OCR behind their scans. On new material, raw OCR from a good program can be very close to 100% correct. It is the older material that causes problems. BTW, MBP does NOT destroy the books they are having scanned. The books are sent to India (or China) for scanning on orbital scanners (think a really good digital camera). Orbital scanning still requires a human to turn the pages but it is much more gentle on the books than flatbed scanners. Books that are on loan from US Libraries will be returned to them. Other material is given to libraries and schools in India when it is no longer needed. The Internet Archive, home of the MBP scans, is also working with the University of Toronto to use a robotic scanner to scan their books. The robot does not damage the books and they are returned to the shelves after scanning. The process is still ramping up, but they are cranking out more books all the time. I'm told that a few DPers from that area are working (for money, lucky them) with the robot. Someone posted on this list (I think) awhile ago with a really good list of what true text (corrected OCR) books can be used for that is not possible with image scans. The difference is huge. JulietS DP Administrator N Wolcott wrote: > Here is Chapter I of the Steam House by Jules Verne. I think they have > a way to go. I wonder if Google is planning to do better? To be fair > the Dejavu look at the rather bad images allows them to be mostly made > out, so one could reconstruct most of this page. The first line is > "TWO THOUSAND POUNDS FOR A HEAD". 
The images are like with a 300 mp > camera, so not surprizing the poor OCR. I think PG texts are easier to > read. I understand the books are being destroyed after scanning: might > as well put them right into the dumpster and save all the trouble. > * > > CHAPTER L > "?nv > * > > * > > " A tst'.WAfct* of two thousand pounds will he paid to any > Mjtr who will drhVrr up, dead or alive, oik* of the prime > movrr? ifMdrney, the Nabob Datulou Taut, nnnuionly > > Jitu'h *vv?t'i *the ijotirr rratl hy the iiilialntants tif Aurun* > > jUitMul, mi tltr rvniiiijj uf the* 6th of March, 1807, > > A cojy iff the jil,u,;in! h?u! been recently affixed to the > wall iif ?i loiirly and ntiiuu! hunj;ulow on the hanks of > llir iJitiuUttiu, *iud already the conu*r of the paper htvir- > iiijj flu* j;r?'on*l fi*iiiir*-?a fiaiiir cx<,%nited hy tiuinCj .secretly > Utltuii'rt) hy uthn^r ji'**"WiH jjone, > > The fiiiiti^ had Urea there^ printal in Inrjjo letters, hut > If wafi torn off hy the hand of a ^solitary fakir who > IKij^ed t?y that Ucst^latc *sj>c,it, The name of the Governor > > * > *N Wolcott nwolcott2@post.harvard.edu * > >* >* >------------------------------------------------------------------------ >* >_______________________________________________ >gutvol-d mailing list >gutvol-d@lists.pglaf.org >http://lists.pglaf.org/listinfo.cgi/gutvol-d >* > From donovan at abs.net Thu Jan 27 15:50:45 2005 From: donovan at abs.net (D Garcia) Date: Thu Jan 27 15:51:45 2005 Subject: Reworks [Was: Re: [gutvol-d] Re: some feedback to gutvol-d] In-Reply-To: <12819626250.20050127133840@noring.name> References: <6.1.2.0.0.20050127130831.01b09cc0@mail.fireantproductions.com> <12819626250.20050127133840@noring.name> Message-ID: <200501271850.45528.donovan@abs.net> On Thursday 27 January 2005 03:38 pm, Jon Noring wrote: > Hopefully, when Distributed Proofreaders begins the process of redoing > the earliest PG works, this problem will finally be fixed. 
> > Jon Noring > > (p.s., as I've noted before, any PG text which "ASCII-ized" any of the > characters in the original source, such as accented characters, is also > broken -- these texts should also be repaired or replaced.) I've been doing some of these independently, see etexts 384, 430, 4787 and 1590. Others on the way as I get physical copies and time. :) David From Bowerbird at aol.com Thu Jan 27 16:54:06 2005 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jan 27 16:54:24 2005 Subject: [gutvol-d] re: "cost-prohibitive" Message-ID: juliet said: > Virtually none of the other > image archives provide corrected text. > It is simply cost-prohibitive to do so. > What DP does as a volunteer effort > would be extremely costly to replicate. > For this reason alone, I believe that > Google will be using raw OCR behind their scans. > On new material, raw OCR from a good program > can be very close to 100% correct. > It is the older material that causes problems. provided careful scans at the right resolution, from the right scanner, the right o.c.r. program combined with the right post-o.c.r. software can yield us accuracy even on "older material" that approaches error-free results. to say that it is "extremely costly" to get this is simply not true. it might have been very true three years ago or so. might've even been true last year. it is untrue now. to see how, and to issue real-world challenges, visit my blog regularly over the upcoming weeks. e-mail me for the address if you are interested... -bowerbird From george at pglaf.org Thu Jan 27 17:58:56 2005 From: george at pglaf.org (George Davis) Date: Thu Jan 27 17:58:58 2005 Subject: [gutvol-d] Free Translations (fwd) Message-ID: The following came into help@.
If no one on this list wants to respond, I'll send it along to PG-EU (but would appreciate a recommendation on a specific contact there.) Thanks, [eorge] ---------- Forwarded message ---------- Date: Wed, 26 Jan 2005 23:48:14 +0000 From: Alessandro Ronchi To: help@pglaf.org Subject: Free Translations I have opened a project to translate free books into Italian, with a translators' university. I need a list of books (in these languages: English, French, German, Spanish, Russian) never translated into Italian. If anyone from Project Gutenberg can send me a "wish list", I will send it to this university and some students will translate the books for free and give them back to the community. I hope I have explained myself; thanks in advance. -- Alessandro Ronchi http://www.aronchi.org http://www.soasi.com From traverso at dm.unipi.it Fri Jan 28 09:26:50 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Fri Jan 28 09:20:32 2005 Subject: [gutvol-d] Free Translations (fwd) In-Reply-To: (message from George Davis on Thu, 27 Jan 2005 17:58:56 -0800 (PST)) References: Message-ID: <200501281726.j0SHQo320337@posso.dm.unipi.it> >>>>> "George" == George Davis writes: George> The following came into help@. If no one on this list George> wants to respond, I'll send it along to PG-EU (but would George> appreciate a recommendation on a specific contact there.) George> Thanks, George> [eorge] George> ---------- Forwarded message ---------- Date: Wed, 26 Jan George> 2005 23:48:14 +0000 From: Alessandro Ronchi George> To: help@pglaf.org Subject: Free George> Translations Alessandro> I have opened a project to translate free books into Italian, Alessandro> with a translators' university. I need a list of Alessandro> books (in these languages: English, French, German, Alessandro> Spanish, Russian) never translated into Italian.
Alessandro> If anyone from Project Gutenberg can send me a "wish Alessandro> list", I will send it to this university and some students Alessandro> will translate the books for free and give them back to the Alessandro> community. Alessandro> I hope I have explained myself; thanks in advance. -- Alessandro Alessandro> Ronchi http://www.aronchi.org http://www.soasi.com I have forwarded the message to liberliber@yahoogroups.com, the mailing list of http://www.liberliber.it , the Italian PG-like project. I am sure that a long wish list will appear. To begin it, I suggest some exploration books that are in PG and do not seem to have been translated into Italian: Richard Burton, Mungo Park, John Hanning Speke, James Richardson, Gerhard Rohlfs (German), and many more. For French, I have many novelists to suggest (Gaboriau, Feval, Corbiere, Houssaye, Flor O'Squarr, etc.), but I have to check whether translations exist. Carlo Traverso From cannona at fireantproductions.com Fri Jan 28 09:24:03 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Jan 28 09:34:48 2005 Subject: [gutvol-d] Fwd: Proposed CD Navigation Files now available for review Message-ID: <6.1.2.0.0.20050128112358.01c90da0@mail.fireantproductions.com> >From: "John Hagerson" >To: "'Aaron Cannon'" >Subject: Proposed CD Navigation Files now available for review >Date: Fri, 28 Jan 2005 08:04:16 -0600 >X-Mailer: Microsoft Outlook, Build 10.0.6626 > >Four navigation files built for a new Project Gutenberg CD-ROM which >contains primarily non-English electronic books are now available for review >at http://www.aaronandgabby.com/pgcd/ The files allow one to browse the CD >by Author, Language and Author, Language and Title, or Title. > >The files were developed from the Project Gutenberg production prior to book >14700. The Distributed Proofreaders have been especially prolific in >non-English books recently, so it seems that a number of books of recent >production will be omitted regardless of where we draw the line.
> >I believe I have included every non-English book produced prior to 14700 >with the exception of three books (7216, 7337, and 12407) where the title >and author were both in Unicode characters that most fonts do not support. >Each of the omitted works is in Chinese. If someone could help me obtain >more information on these works, there is ample space to include them. > >Please respond to the list or directly to mailto:j.hagerson@comcast.net with >your comments regarding the files. > >Thank you. > >Aaron: Before you forward this to the list, please make sure that the http >download works. My attempts to view the directory were met with a 403 error. >Thank you. -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From ag737 at freenet.carleton.ca Fri Jan 28 12:33:43 2005 From: ag737 at freenet.carleton.ca (Wallace J.McLean) Date: Fri Jan 28 12:33:52 2005 Subject: [gutvol-d] million book project Message-ID: <376c78378e3d.378e3d376c78@ncf.ca> > ----- Original Message ----- > From Juliet Sutherland > Date Thu, 27 Jan 2005 17:00:11 -0500 > To Project Gutenberg Volunteer Discussion > Subject Re: [gutvol-d] million book project > Virtually none of the other image archives provide corrected text. It is > simply cost-prohibitive to do so. What DP does as a volunteer effort > would be extremely costly to replicate. For this reason alone, I believe > that Google will be using raw OCR behind their scans. On new material, > raw OCR from a good program can be very close to 100% correct. It is the > older material that causes problems. I actually find, within limits, the opposite to be true. Material from about 1905 through to about 1955 OCRs very, very well. Material before 1905 OCRs progressively worse the further back you go, although "bright" characters on pre-acid paper are almost as good as early 20th-century stuff. And after the 1950s, I find the OCRability goes down again. 
Many post-PC printed books, say from the mid-1980s on, OCR almost as poorly as pre-20th century stuff; there's just something about those typefaces, I guess. From shalesller at writeme.com Fri Jan 28 13:59:42 2005 From: shalesller at writeme.com (D. Starner) Date: Fri Jan 28 13:59:52 2005 Subject: [gutvol-d] Free Translations (fwd) Message-ID: <20050128215942.392454BE64@ws1-1.us4.outblaze.com> A translation of Flatland to Italian has been on the request list for a long, long time. I'm sure the person who made the request has made do, but it's still something to remember. From cannona at fireantproductions.com Sat Jan 29 21:19:16 2005 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sat Jan 29 21:46:25 2005 Subject: [gutvol-d] possible publicity? Message-ID: <6.1.2.0.0.20050129231657.01b0aab0@mail.fireantproductions.com> I believe that Gutenberg received some sort of major press coverage in the U.K. today. I say this because we received over a hundred DVD requests, most from there. Does anyone have any further details on this? Thanks. Sincerely Aaron Cannon -- E-mail: cannona@fireantproductions.com Skype: cannona MSN Messenger: cannona@hotmail.com (Do not send E-mail to the hotmail address.) From traverso at dm.unipi.it Sun Jan 30 03:10:52 2005 From: traverso at dm.unipi.it (Carlo Traverso) Date: Sun Jan 30 03:04:30 2005 Subject: [gutvol-d] New canonical URL for etext directories In-Reply-To: <41F7DBEF.1000003@perathoner.de> (message from Marcello Perathoner on Wed, 26 Jan 2005 19:05:35 +0100) References: <41F56061.4060501@perathoner.de> <41F6CBEC.8020802@adelaide.edu.au> <41F7DBEF.1000003@perathoner.de> Message-ID: <200501301110.j0UBAqi12164@posso.dm.unipi.it> It seems to me that we are redirecting too much: we are even redirecting gutenberg.org! I have tried ebook n.
14837, to avoid being redirected I typed http://www.gutenberg.org/dirs/1/4/8/3/14837/ and it is OK; I click on 14837-h and it is still OK; I click on 14837-h.htm and instead of getting http://www.gutenberg.org/dirs/1/4/8/3/14837/14837-h.htm that is undoubtedly there, I am redirected to http://www.gutenberg.org/catalog/world/file?file=1/4/8/3/14837/14837-h/14837-h.htm that gives I see no such file here! (1/4/8/3/14837/14837-h/14837-h.htm) Typing http://www.gutenberg.org/dirs/1/4/8/3/14837/14837-h.htm sometimes I am redirected as above, sometimes I get: > > Page Not Found > > Sorry, but the page you tried to access can no longer be found under > that url. > > In November 2003, Project Gutenberg's Web pages moved from promo.net > to our new host ibiblio.org. Not all of the content from promo.net was > moved to ibiblio.org, and some of the content was reorganized. > > Also, we are gradually updating all eBooks older than #10.000, and in > the process, moving them to a new filing system. > > Please use the site map to find what you are looking for. We apologize > for the inconvenience. Thanks for visiting Project Gutenberg, and > happy reading! > that does not make sense at all, and once I have been able to get The Real Thing, showing that it exists! Carlo Traverso From marcello at perathoner.de Sun Jan 30 10:09:48 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Jan 30 10:27:27 2005 Subject: [gutvol-d] New canonical URL for etext directories In-Reply-To: <200501301110.j0UBAqi12164@posso.dm.unipi.it> References: <41F56061.4060501@perathoner.de> <41F6CBEC.8020802@adelaide.edu.au> <41F7DBEF.1000003@perathoner.de> <200501301110.j0UBAqi12164@posso.dm.unipi.it> Message-ID: <41FD22EC.90903@perathoner.de> Carlo Traverso wrote: > It seems to me that we are redirecting too much: we are even > redirecting gutenberg.org! I have tried ebook n. 
14837, to avoid being > redirected I typed > > http://www.gutenberg.org/dirs/1/4/8/3/14837/ > > and it is OK; I click on 14837-h and it is still OK; I click on > 14837-h.htm and instead of getting > http://www.gutenberg.org/dirs/1/4/8/3/14837/14837-h.htm that is > undoubtedly there, I am redirected to > > http://www.gutenberg.org/catalog/world/file?file=1/4/8/3/14837/14837-h/14837-h.htm > > that gives > > I see no such file here! (1/4/8/3/14837/14837-h/14837-h.htm) That is the source of the problem! ibiblio is experiencing serious file server overload to the point of complete failure. If the web server cannot get an answer from the file server in a reasonable amount of time it calls the error page (even if the file _is_ on the file server.) Tomorrow ibiblio will move the ftp directories to a new file server. That should fix the problems, at least for some time. -- Marcello Perathoner webmaster@gutenberg.org From marcello at perathoner.de Sun Jan 30 10:43:54 2005 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Jan 30 10:43:56 2005 Subject: [gutvol-d] New canonical URL for etext directories In-Reply-To: <200501301110.j0UBAqi12164@posso.dm.unipi.it> References: <41F56061.4060501@perathoner.de> <41F6CBEC.8020802@adelaide.edu.au> <41F7DBEF.1000003@perathoner.de> <200501301110.j0UBAqi12164@posso.dm.unipi.it> Message-ID: <41FD2AEA.9050206@perathoner.de> Carlo Traverso wrote: > and it is OK; I click on 14837-h and it is still OK; I click on > 14837-h.htm and instead of getting > http://www.gutenberg.org/dirs/1/4/8/3/14837/14837-h.htm that is > undoubtedly there, I am redirected to That URL is wrong, try this one: http://www.gutenberg.org/dirs/1/4/8/3/14837/14837-h/14837-h.htm Works for me. 
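The URLs in this exchange follow a visible pattern: every digit of the ebook number except the last becomes one directory level, followed by the full number. A minimal Python sketch of that pattern, assuming it holds for any number with at least two digits (only ebook 14837 is confirmed by the URLs above); the function names and the BASE constant are hypothetical, chosen for illustration, and are not part of any Project Gutenberg tool:

```python
# Sketch only: the layout is inferred from the 14837 URLs quoted in this thread.
BASE = "http://www.gutenberg.org/dirs"  # prefix taken from the URLs above


def etext_dir(ebook_number):
    """Return the directory part, e.g. '1/4/8/3/14837' for ebook 14837."""
    digits = str(int(ebook_number))
    if len(digits) < 2:
        # Single-digit numbers are not covered by the example in the thread.
        raise ValueError("only ebook numbers with two or more digits are covered")
    # Every digit except the last is one directory level, then the full number.
    return "/".join(digits[:-1]) + "/" + digits


def etext_dir_url(ebook_number):
    """Return the URL of the ebook's directory listing."""
    return f"{BASE}/{etext_dir(ebook_number)}/"


def html_file_url(ebook_number):
    """Return the URL of the HTML edition, which sits in its own NNNNN-h subdirectory."""
    n = int(ebook_number)
    return f"{etext_dir_url(n)}{n}-h/{n}-h.htm"


if __name__ == "__main__":
    print(html_file_url(14837))
```

For 14837 this yields the working URL given above, with the 14837-h subdirectory that was missing from the URL that failed.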
-- Marcello Perathoner webmaster@gutenberg.org From Gutenberg9443 at aol.com Sun Jan 30 15:20:14 2005 From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com) Date: Sun Jan 30 15:20:34 2005 Subject: [gutvol-d] date-sensitive info about ebook purchase Message-ID: <103.59e74a28.2f2ec5ae@aol.com> Hi-- Just so you'll know--JAN 31 is THE LAST day that the eBookWise 1500 will be priced at $99. I don't know what it will go to after that, but the original price, when the device had a different name, was about $300. With the sincerest apologies to my Rocket, which has been close to my heart for the last five years, I now like the 1500 better. It's a very fine machine, and with the expenditure of another $60 for peripherals, which I got a couple of weeks ago but have not yet installed, it will hold over 300 books. It also allows editing in my normal handwriting, allowing me to insert extra blank pages to write on if necessary. We are still recovering from my husband's having fried his computer last Monday. This has involved the expenditure of about $600 (would have been a lot more if we hadn't found a complete desktop system for $299 with a one-year commitment to AOL, which is what we use anyway) and the moving of a total of about 40 gigabytes of programs and data, all of which I have been doing. I hope to be through sometime next week. But I'm not going to complain too much, because since my desktop expired last year I've been using my laptop as my main computer. Now I get to keep the brand new desktop and my laptop goes to my husband, who greatly prefers laptops. As I greatly prefer a desktop, we're both happy. I'll be glad to be able to get back to doing real work, though. Anyway, if you go February 1 to get an ebook reader and go into acute sticker shock, don't say I didn't warn you. If you don't want an ebook reader it doesn't matter anyway. My husband is one of those people who has given an ebook reader a real try-out and doesn't like it, so I know such people exist.
Anne From gbnewby at pglaf.org Sun Jan 30 15:46:18 2005 From: gbnewby at pglaf.org (Greg Newby) Date: Sun Jan 30 15:46:19 2005 Subject: [gutvol-d] date-sensitive info about ebook purchase In-Reply-To: <103.59e74a28.2f2ec5ae@aol.com> References: <103.59e74a28.2f2ec5ae@aol.com> Message-ID: <20050130234618.GA9482@pglaf.org> On Sun, Jan 30, 2005 at 06:20:14PM -0500, Gutenberg9443@aol.com wrote: > Hi-- > > Just so you'll know--JAN 31 is THE LAST day that the eBookWise 1500 will be > priced at $99. I don't know what it will go to after that, but the original > price, when the device had a different name, was about $300. > > With the sincerest apologies to my Rocket, which has been close to my heart > for the last five years, I now like the 1500 better. It's a very fine machine, > and with the expenditure of another $60 for peripherals, which I got a > couple of weeks ago but have not yet installed, it will hold over 300 books. It > also allows editing in my normal handwriting, allowing me to insert extra blank > pages to write on if necessary. For what it's worth: I bought one of these, too, based on Anne's positive review. I haven't used an eBook reader before, so don't have anything to compare it to. This is a great little machine, and I've read several thousand pages on it. Highly recommended. The little proggies for dumping your own content on it all require Windows, so I've yet to try them (soon...). Unlike my MP3 player and other devices, you cannot just dump files on the reader via USB. (Ok, you can - but the reader won't display them or acknowledge them.) The main limitation is the available literature. Most of the contemporary literature consists of lesser-known authors, or lesser-known works from well-known authors.
For example, I used my $20 coupon to buy older novels by Greg Bear & Dan Simmons. Quality stuff, but well over 10 years old.

Evidently, the mainstream publishers are not putting their mainstream works onto the Fictionwise site - maybe they're elsewhere. My strong suspicion is that many of the works on the Fictionwise site are those that are owned by authors, not publishers.

So, right now, this device doesn't replace bn.com or whatever for my reading of contemporary works. Still, it's pretty darned good for a $99 device.

-- Greg

From Gutenberg9443 at aol.com Mon Jan 31 07:23:48 2005
From: Gutenberg9443 at aol.com (Gutenberg9443@aol.com)
Date: Mon Jan 31 07:23:56 2005
Subject: [gutvol-d] date-sensitive info about ebook purchase
Message-ID: <8d.1f9e780f.2f2fa784@aol.com>

In a message dated 1/30/2005 4:46:21 PM Mountain Standard Time, gbnewby@pglaf.org writes:

Evidently, the mainstream publishers are not putting their mainstream works onto the Fictionwise site - maybe they're elsewhere. My strong suspicion is that many of the works on the Fictionwise site are those that are owned by authors, not publishers. So, right now, this device doesn't replace bn.com or whatever for my reading of contemporary works.

There is a lot of new stuff at FictionWise. It's in RB format and can be dumped straight into the ebook. Probably most of the stuff on the sites is older and the author has gotten copyright reversion, but more and more publishers are getting the idea and putting new works up. For example, THE DA VINCI CODE went up on FictionWise about the same time it was released in hardback. Its success in eformat has certainly caught the eyes of other mainstream publishers.

It's a good beginning, but it IS a beginning.

Anne

From marcello at perathoner.de Sun Jan 30 11:01:21 2005
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Jan 31 10:43:01 2005
Subject: [gutvol-d] possible publicity?
In-Reply-To: <6.1.2.0.0.20050129231657.01b0aab0@mail.fireantproductions.com>
References: <6.1.2.0.0.20050129231657.01b0aab0@mail.fireantproductions.com>
Message-ID: <41FD2F01.1050306@perathoner.de>

Aaron Cannon wrote:

> I believe that Gutenberg received some sort of major press coverage in
> the U.K. today. I say this because we received over a hundred DVD
> requests, most from there. Does anyone have any further details on this?

We had 300 referrers from this page:

http://www.thedvdforums.com/forums/showthread.php?t=347318

To see the contents you have to register, so that part is left as an exercise to the reader.

--
Marcello Perathoner
webmaster@gutenberg.org

From holden.mcgroin at dsl.pipex.com Mon Jan 31 15:24:33 2005
From: holden.mcgroin at dsl.pipex.com (Holden McGroin)
Date: Mon Jan 31 15:24:46 2005
Subject: [gutvol-d] possible publicity?
In-Reply-To: <41FD2F01.1050306@perathoner.de>
References: <6.1.2.0.0.20050129231657.01b0aab0@mail.fireantproductions.com> <41FD2F01.1050306@perathoner.de>
Message-ID: <41FEBE31.8090404@dsl.pipex.com>

Marcello Perathoner wrote:

>
> We had 300 referrers from this page:
>
> http://www.thedvdforums.com/forums/showthread.php?t=347318
>
> To see the contents you have to register, so that part is left as an
> exercise to the reader.

Here ya go :-)

Holden

----------
Chunky

Free DVD or CD full of books (over 9k on the dvd)

over 10,000 Free e-books are now available from the Project Gutenberg website, and if you're interested, you can get a free CD or DVD containing either the best selection, or the 9,400 or so up until Dec 2003.
For your free DVD or CD, goto http://www.gutenberg.org

goto http://www.gutenberg.org/cdproject/dvdreq-usa.html (if you're located in the USA)

or http://www.gutenberg.org/cdproject/dvdreq-int.html (if you're other than, for International disks.)

Please remember that the Project Gutenberg is a non-profit outfit, and if you enjoy the books, (which include, for example, all Shakespeare, etc.) then please make a donation to them.

They're also looking for people to distribute it for free, so if you're in a charitable mood, think about getting one, and passing a couple of copies on to your local schools. (I work in one, and believe me, they'd appreciate it)

Price is free, so this is a bargain.

Cheers,
Chunks
----------

From kthagen at eliteprep.com Wed Jan 26 15:24:11 2005
From: kthagen at eliteprep.com (Karl Hagen)
Date: Wed Feb 2 20:00:32 2005
Subject: [gutvol-d] Copyright Office requests comments on orphaned works
Message-ID: <41F8269B.80400@eliteprep.com>

On BoingBoing, I ran across a link to this notice of inquiry, whose importance to PG seems glaringly obvious:

SUMMARY: The Copyright Office seeks to examine the issues raised by "orphan works," i.e., copyrighted works whose owners are difficult or even impossible to locate. Concerns have been raised that the uncertainty surrounding ownership of such works might needlessly discourage subsequent creators and users from incorporating such works in new creative efforts or making such works available to the public. This notice requests written comments from all interested parties. Specifically, the Office is seeking comments on whether there are compelling concerns raised by orphan works that merit a legislative, regulatory or other solution, and what type of solution could effectively address these concerns without conflicting with the legitimate interests of authors and right holders.
Full document at
http://a257.g.akamaitech.net/7/257/2422/01jan20051800/edocket.access.gpo.gov/2005/05-1434.htm

From hart at pglaf.org Fri Jan 28 10:06:10 2005
From: hart at pglaf.org (Michael Hart)
Date: Wed Feb 2 20:00:36 2005
Subject: [gutvol-d] Error Correction Data Needed
Message-ID:

[Please excuse cross-posting.]

I have been doing some additional research on error correction, particularly as it might apply to Project Gutenberg eBooks, but also in more general terms.

In my previous thoughts on this subject I came to the conclusion that perhaps 1/2 of the errors present would/could/should be found via each additional proofreading pass, provided it was done by an external source to the previous proofreadings. [Proofreaders do tend to miss the same errors more often the second time around.]

However, my most recent research, in conjunction with the head of error correction at a major publisher, leads me to think 1/3 of errors might be found per pass, instead of the previous 1/2.

If any of you have any suggestions as to what these figures are, please let me know.

Thanks So Much!!!

Michael S. Hart
Project Gutenberg
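
[Editorial illustration] The two per-pass figures in the message above (1/2 versus 1/3 of errors found per pass) can be compared with a simple geometric model. This is only a minimal sketch: it assumes every pass finds the same fraction of whatever errors remain, independently of earlier passes, which the message itself does not establish. The function names, the starting count of 100 errors, and the target of one remaining error are hypothetical values chosen for illustration.

```python
def remaining_errors(initial_errors: float, find_rate: float, passes: int) -> float:
    """Expected errors left after `passes` rounds, assuming each round
    finds the fraction `find_rate` of the errors still present."""
    if not 0.0 < find_rate <= 1.0:
        raise ValueError("find_rate must be in (0, 1]")
    if passes < 0:
        raise ValueError("passes must be non-negative")
    return initial_errors * (1.0 - find_rate) ** passes


def passes_needed(initial_errors: float, find_rate: float, target: float) -> int:
    """Smallest number of passes that brings the expected error count
    down to `target` or below under the same assumption."""
    if not 0.0 < find_rate <= 1.0:
        raise ValueError("find_rate must be in (0, 1]")
    if target <= 0.0:
        raise ValueError("target must be positive")
    passes = 0
    remaining = initial_errors
    while remaining > target:
        remaining *= 1.0 - find_rate
        passes += 1
    return passes


if __name__ == "__main__":
    # Hypothetical starting point: 100 errors, aiming for at most 1 left.
    for rate in (1 / 2, 1 / 3):
        print(f"rate {rate:.3f}: {passes_needed(100, rate, 1)} passes")
```

Under these assumptions, going from 100 expected errors to at most one takes 7 passes at a rate of 1/2 and 12 passes at a rate of 1/3, so the lower estimate noticeably increases the proofreading effort needed for the same result.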