From Bowerbird at aol.com Sun Jul 2 01:14:33 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sun Jul 2 01:14:39 2006 Subject: [gutvol-d] the little red hen Message-ID: <3af.46261f0.31d8da69@aol.com> well, well, ok, one of my favorite stories -- the little red hen -- has now hit p.g. with its illustrations, as e-text #18735. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060702/5ad5e3d9/attachment.html From tony at baechler.net Sun Jul 2 23:58:10 2006 From: tony at baechler.net (Tony Baechler) Date: Sun Jul 2 23:58:01 2006 Subject: [gutvol-d] ftp.archive.org In-Reply-To: <44A5B5A3.1010703@aol.com> References: <7.0.1.0.2.20060626114129.032ee4e0@baechler.net> <44A04DB1.6070608@aol.com> <7.0.1.0.2.20060628003711.03fdd800@baechler.net> <7.0.1.0.2.20060630005304.03354d60@baechler.net> <44A5B5A3.1010703@aol.com> Message-ID: <7.0.1.0.2.20060702235033.03360330@baechler.net> Hello list. I really appreciate the help with rsync, especially David's sample command. However, I don't think people understand what I want. I do not want a full mirror of the PG archive. I do have a partial mirror but it's very specialized. I don't follow the standard PG directory structure. I only download English books in plain text. I don't want html, non-English, or 8-bit. I have books divided up into 1,000 per directory. For example, my etext18 contains files 18000-18999 etc. For reposts, etext0\ contains books 0-999. This is completely different from the PG structure. Once I download a file, I don't have any reason to retrieve it again unless it gets updated. If rsync will grab all the files I need and put them in my customized structure, rsync will help. However, based on my reading of the help, it doesn't do that. My apologies for the Windows comment. I found it and it's part of cygwin as written below. The syntax actually isn't as bad as I thought. 
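The 1,000-books-per-directory layout described above reduces to a one-line mapping from etext number to directory. A minimal sketch, where the helper name `custom_dir` and the `etextN` naming are this sketch's assumptions for illustration, not part of any PG or rsync tooling:

```python
def custom_dir(etext_number):
    # Hypothetical helper: etext18 holds books 18000-18999,
    # etext0 holds books 0-999, and so on.
    return "etext%d" % (etext_number // 1000)

print(custom_dir(18735))  # etext18
print(custom_dir(500))    # etext0
```

A script driving rsync could use such a mapping to decide the destination directory for each downloaded file.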
It reminds me of wget which I use frequently. It would be perfect for mirroring a primary server to a backup machine. Again, I'm not trying to mirror here so it doesn't do what I want based on my understanding of comments made on this list and the rsync help. At 04:37 PM 6/30/06 -0700, you wrote: >Rsync's available for Windows as part of the cygwin package. Just like >FTP or wget you can tell rsync to get only the stuff you want. And >unlike FTP or wget it will only download the files that need updating, >without you having to wait several hours for it to skip over every file >that hasn't changed. > >I admit it can be confusing since it's a very powerful tool. I was >talking about it with Aaron Cannon and he says it's a better way to make >a "mirror" of PG (with or without specific files that you want). -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.1.394 / Virus Database: 268.9.8/380 - Release Date: 6/30/06 From tony at baechler.net Mon Jul 3 00:06:03 2006 From: tony at baechler.net (Tony Baechler) Date: Mon Jul 3 00:05:56 2006 Subject: [gutvol-d] ftp.archive.org In-Reply-To: References: <7.0.1.0.2.20060626114129.032ee4e0@baechler.net> <44A04DB1.6070608@aol.com> <7.0.1.0.2.20060628003711.03fdd800@baechler.net> <7.0.1.0.2.20060630005304.03354d60@baechler.net> <44A5B5A3.1010703@aol.com> Message-ID: <7.0.1.0.2.20060702235853.0336ea40@baechler.net> Hello David. I tried Unison extensively because it's available for both Windows and Debian. I thought it would be great for mirroring the Debian files to the Windows machine etc. I found it far worse to use than rsync. I followed the instructions exactly but I never got it to work. It uses a complicated url scheme that doesn't conform to standards. While it uses a protocol similar to rsync, it isn't the same. It has no means of encrypting the files. The documentation makes it clear that it is not secure and shouldn't be used for anything critical.
I found that scp works a lot better, is far more secure and actually works. I eventually got the files that way and had ssh encryption as a bonus. Believe me, I would rather learn rsync than try that again. I'm sure it's useful for some applications but not in this case. Also, both systems must run the same version which means that all old versions need to be kept available. As you probably know, Debian stable can be a few versions behind the regular development. As it turned out, the versions didn't match so I had to find an older Windows version to get it to work. Thanks anyway. At 07:56 PM 6/30/06 -0400, you wrote: > How about using Unison? > > http://www.cis.upenn.edu/~bcpierce/unison/ From gbnewby at pglaf.org Mon Jul 3 03:45:05 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Mon Jul 3 03:45:07 2006 Subject: [gutvol-d] ftp.archive.org In-Reply-To: <7.0.1.0.2.20060628004644.03fd5a20@baechler.net> References: <7.0.1.0.2.20060626114129.032ee4e0@baechler.net> <20060626223514.GA11041@pglaf.org> <7.0.1.0.2.20060628004644.03fd5a20@baechler.net> Message-ID: <20060703104505.GB24484@pglaf.org> On Wed, Jun 28, 2006 at 12:52:27AM -0700, Tony Baechler wrote: > Hi. Thanks very much, the readingroo.ms server seems much > faster. When I checked last, snowy.arsc.alaska.edu seemed to be a > few hours behind the other master sites. I am no longer able to This was a little mysterious... turns out I have two independent copies on snowy.arsc.alaska.edu. The one at ftp://snowy.arsc.alaska.edu/mirrors/gutenberg is not actually a mirror. It receives a live copy of files as they are posted to readingroo.ms and our main server at ibiblio.org (which runs gutenberg.org's server). The one at http://snowy.arsc.alaska.edu/gutenberg is just a regular mirror that I pull back from ibiblio, daily. That explains why it's not quite current.
I'm in the middle of setting up some additional mirrors, so this will probably continue to change a bit. -- Greg > connect to ftp.archive.org, it just times out. I am not a Debian > expert but I do run a Debian server and know a reasonable amount > about it. What needs doing? I am not really a programmer but I know > how to install packages and set up things for the most part. If > there is something that needs to be done, let me know and I'll see. > > At 03:35 PM 6/26/06 -0700, you wrote: > > >I hope this helps. My guess is the readingroo.ms server will > >give you the best throughput (though it will have some > >brief downtime, then possibly be heavily loaded during the > >world ebook fair, http://www.worldebookfair.com). > > > >Are there any Debian whizzes on this list who might want to help look > >after the readingroo.ms server with me? > > > > -- Greg > > > -- > No virus found in this outgoing message. > Checked by AVG Anti-Virus. > Version: 7.1.394 / Virus Database: 268.9.5/376 - Release Date: 6/26/06 > From gbnewby at pglaf.org Tue Jul 4 01:44:20 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Tue Jul 4 01:44:23 2006 Subject: [gutvol-d] New DVD ISO to try In-Reply-To: <7.0.1.0.2.20060628084333.0426a7b0@baechler.net> References: <20060626093237.GA27369@pglaf.org> <7.0.1.0.2.20060628084333.0426a7b0@baechler.net> Message-ID: <20060704084420.GA11229@pglaf.org> Thanks to everyone who provided feedback and ideas for the new DVD image. I've made a new image that contains *all* of the plain text titles (zipped), plus a bunch of multimedia and some nice HTML with images. Feedback welcome: http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special-work/ my notes on what's included: http://snowy.arsc.alaska.edu/gbn/pgimages/newdvd.txt As you will read at the first URL, I went ahead and included lots of our copyrighted content. Then, I said that the DVD could be given away, but NOT sold. I like this. Enjoy! 
-- Greg From cannona at fireantproductions.com Tue Jul 4 07:39:20 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Tue Jul 4 07:39:48 2006 Subject: [gutvol-d] New DVD ISO to try References: <20060626093237.GA27369@pglaf.org><7.0.1.0.2.20060628084333.0426a7b0@baechler.net> <20060704084420.GA11229@pglaf.org> Message-ID: <000201c69f77$af35a190$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This looks great. One thing you could change would be to redo the title index and distribute all of the "the" titles to their proper places in the lists. The reason is that the t index for titles is huge. Either that or split it. HTML files that are too large cause problems for my screen reader, and I imagine that they might for some older systems as well, but I could be wrong. As an alternative, or in addition, I would suggest also providing a text index of titles and authors. Before you make the ISO, you might slap the autorun.inf file into the root directory. Use the one from the CD as the DVD autorun doesn't work on older systems. You shouldn't need to change anything, as it already points to index.html. Finally, I would like to write up a short set of instructions on how to "Install" a copy of the dvd on your hard drive. It wouldn't be anything fancy, just create a folder, copy the contents of the disc to that folder, and create a short cut to index.html and place it either on the desktop or under programs. If any mac users would like to write some instructions for their OS, they would be appreciated I'm sure. Anyone who uses Linux or the like shouldn't need instructions, but if someone disagrees, they could be included as well if you write them. That's all for now. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) 
- ----- Original Message ----- From: "Greg Newby" To: "Project Gutenberg Volunteer Discussion" Cc: "Project Gutenberg CDs" Sent: Tuesday, July 04, 2006 3:44 AM Subject: [gutvol-d] New DVD ISO to try > Thanks to everyone who provided feedback and ideas for the new DVD > image. I've made a new image that contains *all* of the plain text > titles (zipped), plus a bunch of multimedia and some nice HTML with > images. > > Feedback welcome: > > http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special-work/ > > my notes on what's included: > > http://snowy.arsc.alaska.edu/gbn/pgimages/newdvd.txt > > As you will read at the first URL, I went ahead and included > lots of our copyrighted content. Then, I said that the DVD > could be given away, but NOT sold. I like this. > > Enjoy! > -- Greg > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFEqn20I7J99hVZuJcRAj7JAJ0SAru+IMO+NrLX4aXe1lvq4svVNACfQEI1 yURPmloPbGZeKGXQEMR1zzY= =cqkK -----END PGP SIGNATURE----- From kouhia at nic.funet.fi Tue Jul 4 08:15:31 2006 From: kouhia at nic.funet.fi (Juhana Sadeharju) Date: Tue Jul 4 08:47:50 2006 Subject: [gutvol-d] Copyright question Message-ID: Hello. Most often I hear that the copyright of the book lasts 80 years after the death of author. But it is normal that the copyright is transferred to the publisher in the contract. Then why the copyright expiration is still tied to the author who don't have the copyright anymore? Is this misuse of copyright law? Should author keep the copyright (and publisher only license) so that the death+80 rule applies? That is most convenient to publishers, of course, because they get the copyright and its expiration is still tied to the author. 
In the example case, the book writing contract was made 8 years ago and the contract included the second edition published now. Because the publisher owns the copyright of the second edition already due the contract, the author has never owned the copyright. So how in this case the copyright expiration could never be tied to the author? Juhana From sly at victoria.tc.ca Tue Jul 4 09:16:35 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Jul 4 09:16:37 2006 Subject: [gutvol-d] Copyright question In-Reply-To: References: Message-ID: Copyright laws are different in every country. I know that in Canada, the duration of copyright is determined by the life-span of the creator, regardless of who actually owns the copyright. I cannot speak for any other countries. You are unlikely to find a useful answer here on the Project Gutenberg Volunteer Discussion list. For a list dedicated to discussing copyright issues, see: http://www.cni.org/forums/cni-copyright/ Andrew On Tue, 4 Jul 2006, Juhana Sadeharju wrote: > > Hello. Most often I hear that the copyright of the book lasts 80 years > after the death of author. But it is normal that the copyright is > transferred to the publisher in the contract. Then why the copyright > expiration is still tied to the author who don't have the copyright > anymore? Is this misuse of copyright law? Should author keep the > copyright (and publisher only license) so that the death+80 rule > applies? > > That is most convenient to publishers, of course, because they > get the copyright and its expiration is still tied to the author. > > In the example case, the book writing contract was made 8 years > ago and the contract included the second edition published now. > Because the publisher owns the copyright of the second edition > already due the contract, the author has never owned the copyright. > So how in this case the copyright expiration could never be tied to > the author? 
> > Juhana > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From Bowerbird at aol.com Tue Jul 4 11:11:07 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Jul 4 11:11:15 2006 Subject: [gutvol-d] New DVD ISO to try Message-ID: greg said: > http://snowy.arsc.alaska.edu/gbn/pgimages/newdvd.txt thanks, greg, this document is _extremely_ informative. have a happy holiday... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060704/57646858/attachment.html From gbnewby at pglaf.org Tue Jul 4 12:37:05 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Tue Jul 4 12:37:06 2006 Subject: [gutvol-d] New DVD ISO to try In-Reply-To: References: Message-ID: <20060704193705.GD26049@pglaf.org> On Tue, Jul 04, 2006 at 02:11:07PM -0400, Bowerbird@aol.com wrote: > greg said: > > http://snowy.arsc.alaska.edu/gbn/pgimages/newdvd.txt > > thanks, greg, this document is _extremely_ informative. That's the part that looks like it should be easy, but actually took me about 25 hours! I also have the list of "best of" titles, which took a long time. It's in the same location: http://snowy.arsc.alaska.edu/gbn/pgimages/ -- Greg From Bowerbird at aol.com Tue Jul 4 16:24:30 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Tue Jul 4 16:24:44 2006 Subject: [gutvol-d] New DVD ISO to try Message-ID: <4ba.35e4a06.31dc52ae@aol.com> greg said: > That's the part that looks like it should be easy, > but actually took me about 25 hours!? oh yeah, i'm well aware that it's harder than it looks. i'll have a few questions for you about it, tomorrow... for today, happy birthday to project gutenberg! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060704/a14a65c4/attachment.html From desrod at gnu-designs.com Tue Jul 4 14:43:15 2006 From: desrod at gnu-designs.com (David A. Desrosiers) Date: Tue Jul 4 18:43:32 2006 Subject: [gutvol-d] ftp.archive.org In-Reply-To: <7.0.1.0.2.20060702235033.03360330@baechler.net> References: <7.0.1.0.2.20060626114129.032ee4e0@baechler.net> <44A04DB1.6070608@aol.com> <7.0.1.0.2.20060628003711.03fdd800@baechler.net> <7.0.1.0.2.20060630005304.03354d60@baechler.net> <44A5B5A3.1010703@aol.com> <7.0.1.0.2.20060702235033.03360330@baechler.net> Message-ID: <1152049396.16394.6.camel@localhost.localdomain> On Sun, 2006-07-02 at 23:58 -0700, Tony Baechler wrote: > If rsync will grab all the files I need and put > them in my customized structure, rsync will help. However, based on > my reading of the help, it doesn't do that. In fact, it does exactly that, depending on how you tell it where to put the files it's copying over. You may have to script a bit of it or run several rsync commands in a series to get what you want (fetch text first, indices next and so on). rsync -avSP --delete *[0-9].txt /my/custom/directory rsync -avSP --delete *.gz /other/place ...and so on. I've been using rsync for MANY years now, and tridge is one of the alumni of a previous company we both worked for (Linuxcare), so I can say with absolute certainty that if you're running into trouble with what rsync does for you, you're doing something wrong ;) If there's one thing rsync does well, it's everything. There are even people out there who use rsync _exclusively_ as their MTA/MDA. Nutty, but true. > It would be perfect for mirroring a primary server to a backup > machine. Again, I'm not trying to mirror here so it doesn't do what I > want based on my understanding of comments made on this list and the > rsync help. It can do a lot of things, incremental backups, snapshots, mirroring, cloning of directories, complete transposition...
pretty much anything you want. It just takes a remote file, block-copies it to some local place (or local to local, if you're cloning a drive for example. I've used rsync quite a bit to upgrade hard drives in laptops, works great). In any case, just define your schema and apply the rsync methodology to it. No need to get complicated or fancy. Oh, and lastly... rsync does NOT have to have the same version running on both ends. If that were true, it would break in thousands of situations. You simply have to have a version which understands the options you're passing it (i.e. rsync v1.x isn't future-compatible with v2.6.6). I can't speak for the Windows <-> Linux synchronization, but it should be moot. You don't need to run rsyncd to rsync files from machine to machine either, but you can if you wish. Good luck! -- David A. Desrosiers desrod gnu-designs com http://gnu-designs.com "Erosion of civil liberties... is a threat to national security." From rnmscott at netspace.net.au Wed Jul 5 06:18:02 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 06:18:08 2006 Subject: [gutvol-d] Re: Automated readability scores for PG eBooks Message-ID: <1152105482.44abbc0abcbbd@webmail.netspace.net.au> Interesting idea, Greg. Amazon has this for some texts, via another method or two (I think; the names escape me currently, Flesch something or other?), and I think there are cpan modules that do these. A GUTREADABILITY file from these would be interesting/fun. Someone would get an interesting academic project or two out of that perhaps, as well?
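The Flesch-style scores mentioned here are just ratios over sentence, word, and syllable counts. A rough sketch, assuming the standard Flesch Reading Ease formula and a crude vowel-run syllable counter (a dictionary-backed counter, like the CPAN modules would use, is more accurate):

```python
import re

def count_syllables(word):
    # Crude heuristic: count runs of vowels, minimum one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    # Standard Flesch Reading Ease formula; higher scores = easier text.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syll = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syll / n_words)
```

Simple children's prose scores high on this scale; dense, polysyllabic prose scores low (or even negative), which is why such scores pick out children's books easily.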
Richard ------------------------------------------------------------ This email was sent from Netspace Webmail: http://www.netspace.net.au From rnmscott at netspace.net.au Wed Jul 5 06:24:27 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 06:24:31 2006 Subject: [gutvol-d] Re: Automated readability scores for PG eBooks In-Reply-To: <1152105482.44abbc0abcbbd@webmail.netspace.net.au> References: <1152105482.44abbc0abcbbd@webmail.netspace.net.au> Message-ID: <1152105867.44abbd8ba3b73@webmail.netspace.net.au> greg said: > one value of this is that it does > a good job of identifying children's eBooks > (they tend to be "easy"). checklist said: > bigword density > short word density (-) > wordsPerSentences > syllablesPerWords > profainwordsPerWords > numbersPerWords > mostCommon1000WordsPerWord (-) > commascharsPerWords > wordsPerParagraphs > letterFrequencyDistributionError > adjacentLetterPairsFrequencyDistributionError > uniqueStemmedWordsPerWord; aren't scientists silly? :+) look, greg, if you want a list of children's e-books, or a list of "easy" e-books, or any kind of list of books, just ask the distributed proofreaders people for the list... -- If I ask them to classify 20000 books for me, will I get a reply any time this century? :-) they'll give you a long list of books, any kind of list you want, and you won't have to do one little bit of fancy-ass statistics... i'm serious, they can give a list with p.g. e-text numbers and meaningful notes, and funny little stories, and _everything_... much more vivid than your boring-ass statistics... :+) -- and anecdotes about 'hey I remember that really weird typo on page 263' and 'I picked that up at Fred's garage sale on Smith street' are very exciting?
;-) From rnmscott at netspace.net.au Wed Jul 5 06:10:22 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 06:29:45 2006 Subject: [gutvol-d] Re: 'Lasker's Manual of Chess' Message-ID: <1152105022.44abba3e4d275@webmail.netspace.net.au> Interesting idea. I had never even thought of chess works, despite having actually read this, way back when, I think. How would you do it, with images? Some of them could be pretty big, with lots of board positions. Re-doing them as ascii boards like on old chess servers wouldn't be too much fun, but possible? Richard From rnmscott at netspace.net.au Wed Jul 5 06:45:11 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 06:45:15 2006 Subject: [gutvol-d] Re: Automated readability scores for PG eBooks In-Reply-To: <1152105867.44abbd8ba3b73@webmail.netspace.net.au> References: <1152105482.44abbc0abcbbd@webmail.netspace.net.au> <1152105867.44abbd8ba3b73@webmail.netspace.net.au> Message-ID: <1152107111.44abc267d5099@webmail.netspace.net.au> On 6/26/06, Scott Lawton wrote: > While I agree that it would not be worth adding readability score if it had much > impact on these and other worthy goals, But if it doesn't, then those goals aren't reasons _for_ adding it. > There are lots and lots of cool things that could be done with the catalog. We could start with the results of stripping the header and running wc on it. That strikes me as at least as useful as this result. Also, the ten or twelve most common words in the book after stripping the ten or twelve most common words in the English language.
-- I'd like to see that too, word count is perhaps a little more meaningful to your average reader than size in kilobytes (which is displayed, and useful to know as well, of course) > Even in the context of the above, the scores would provide a great starting point for > being improved with manual cataloging and literacy labeling. I don't think so. It's downright useless for manual cataloging, as it only handles that one dimension. I don't think it will help literacy labeling much, either, which is best done manually. -- Certainly wouldn't be useless, if you are going to catalogue/tag things manually. > Don't let the perfect stand in the way of the good. But I don't think having these numbers anywhere prominent is good. Right now our pages only have a few pieces of important information; minutiae like this should go to a page linked to a page linked only from the book page, which we can fill with various stats to our hearts' content. -- Shouldn't be next to the title, but an index page of all of them would be cool. As the texts stand now, lots of them have pages of mind-numbing legalese already, not sure two lines of numbers matter in all of that. It also seems a little weird to have some proprietary reading level numbers on the system, instead of the Fog index or the Flesch-Kincaid Readability tests. It feels like an advertisement. -- Those were the two I was trying to think of before, thanks!
:) This is clipped from the 'Voyages of Doctor Dolittle':

Readability (compared with books in All Categories)
  Fog Index:              8.3      16% are easier   84% are harder
  Flesch Index:          74.6      11% are easier   89% are harder
  Flesch-Kincaid Index:   6.5      17% are easier   83% are harder

Complexity (learn more)
  Complex Words:          6%        8% have fewer   92% have more
  Syllables per Word:     1.4       7% have fewer   93% have more
  Words per Sentence:    14.8      39% have fewer   61% have more
  Number of Characters: 387,512    47% have fewer   53% have more
  Words:                 72,671    55% have fewer   45% have more
  Sentences:              4,912    64% have fewer   36% have more

I suppose with wc etc. you could have percentiles of book by 'length' etc. which are relevant to people, too. Richard From jon.ingram at gmail.com Wed Jul 5 08:10:59 2006 From: jon.ingram at gmail.com (Jon Ingram) Date: Wed Jul 5 08:18:06 2006 Subject: [gutvol-d] Re: 'Lasker's Manual of Chess' In-Reply-To: <1152105022.44abba3e4d275@webmail.netspace.net.au> References: <1152105022.44abba3e4d275@webmail.netspace.net.au> Message-ID: <4baf53720607050810s3f4ad3b8ud15615f88e2ccf31@mail.gmail.com> On 7/5/06, rnmscott@netspace.net.au wrote: > Interesting idea. I had never even thought of chess works, despite having > actually read this, way back when, I think. > > How would you do it, with images? Some of them could be pretty big, with lots > of board positions. Re-doing them as ascii boards like on old chess servers > wouldn't be too much fun, but possible? Symbols for all the chess pieces are in Unicode (see http://www.unicode.org/charts/PDF/U2600.pdf ), but I don't imagine the glyphs are in all that many fonts! Having lots of images isn't that big a problem, especially if the images are only black-and-white.
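For reference, the chess glyphs in that Unicode chart sit at U+2654 through U+265F. A small sketch of rendering a board rank with them; the letter-to-glyph mapping below is this sketch's own convention (English piece initials, lowercase for black), not taken from any PG text:

```python
# White pieces occupy U+2654-U+2659, black pieces U+265A-U+265F.
PIECES = {"K": "\u2654", "Q": "\u2655", "R": "\u2656",
          "B": "\u2657", "N": "\u2658", "P": "\u2659",
          "k": "\u265A", "q": "\u265B", "r": "\u265C",
          "b": "\u265D", "n": "\u265E", "p": "\u265F"}

def render_rank(rank):
    # '.' (or any unmapped character) stays an empty square.
    return " ".join(PIECES.get(c, ".") for c in rank)

print(render_rank("rnbqkbnr"))  # black back rank in glyphs
```

Whether this displays correctly depends entirely on the reader's fonts, which is exactly the portability concern raised above.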
-- Jon Ingram From jon at noring.name Wed Jul 5 08:27:44 2006 From: jon at noring.name (Jon Noring) Date: Wed Jul 5 08:27:59 2006 Subject: [gutvol-d] Re: 'Lasker's Manual of Chess' In-Reply-To: <4baf53720607050810s3f4ad3b8ud15615f88e2ccf31@mail.gmail.com> References: <1152105022.44abba3e4d275@webmail.netspace.net.au> <4baf53720607050810s3f4ad3b8ud15615f88e2ccf31@mail.gmail.com> Message-ID: <1256093780.20060705092744@noring.name> Jon Ingram wrote: > rnmscott@netspace.net.au wrote: >> Interesting idea. I had never even thought of chess works, despite having >> actually read this, way back when, I think. >> >> How would you do it, with images? Some of them could be pretty big, with lots >> of board positions. Re-doing them as ascii boards like on old chess servers >> wouldn't be too much fun, but possible? > Symbols for all the chess pieces are in Unicode (see > http://www.unicode.org/charts/PDF/U2600.pdf > ), but I don't image the glyphs are in all that many fonts! > > Having lots of images isn't that big a problem, especially if the > images are only black-and-white. Another approach to consider, and with any highly formatted textual objects where "layout is content" [note], is to use SVG to represent the chess board positions. With animated SVG, one should even be able to show the move-by-move board positions. SVG rendering engines are getting to be ubiquitous. The Mozilla engine includes support for some flavor of SVG. Jon [note: such things as ultra-complex tables, poetry and prose where the position of the text itself communicates content, etc., are types of content amenable to representation using SVG.] From sly at victoria.tc.ca Wed Jul 5 09:05:56 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Wed Jul 5 09:05:59 2006 Subject: [gutvol-d] Re: 'Lasker's Manual of Chess' In-Reply-To: <1152105022.44abba3e4d275@webmail.netspace.net.au> References: <1152105022.44abba3e4d275@webmail.netspace.net.au> Message-ID: Hmmm..... 
ok summary of ideas so far, and one more of my own.

1) As you mention, you could do the positions using just ascii characters. (a little tedious to do, but perhaps the most portable.)

2) Similar to above, only using high-unicode codepoints for chess pieces. Good point: standards compliant. Drawback: at present, not very many people could view it correctly.

3) You could extract images of each board position from page scans, and create an html file.

4) Using your own software, or what-have-you, you could create new images showing the same positions. (Probably result in cleaner images this way.)

5) Jon Noring mentioned using SVG, which I wouldn't have thought of. Investigate at your pleasure.

6) I'm sure I've seen somewhere in PG some use of PGN for recording chess games. It might be of use in this case. See: http://en.wikipedia.org/wiki/Portable_Game_Notation

Andrew On Wed, 5 Jul 2006 rnmscott@netspace.net.au wrote: > Interesting idea. I had never even thought of chess works, despite having > actually read this, way back when, I think. > > How would you do it, with images? Some of them could be pretty big, with lots > of board positions. Re-doing them as ascii boards like on old chess servers > wouldn't be too much fun, but possible? > > Richard From slybarger at gmail.com Wed Jul 5 09:35:27 2006 From: slybarger at gmail.com (Suzanne Lybarger) Date: Wed Jul 5 09:35:31 2006 Subject: [gutvol-d] Re: 'Lasker's Manual of Chess' In-Reply-To: References: <1152105022.44abba3e4d275@webmail.netspace.net.au> Message-ID: <72f95c520607050935t7ed5b9d7r417abe01da2bf489@mail.gmail.com> On 7/5/06, Andrew Sly wrote: > 6) I'm sure I've seen somewhere in PG some use of PGN > for recording chess games. It might be of use in this > case. See: http://en.wikipedia.org/wiki/Portable_Game_Notation Yep! Here is The Blue Book of Chess, post-processed by Peter Barozzi at DP: http://www.gutenberg.org/etext/16377 He used PGN and provided modern notation for all the games in both file types.
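Ascii boards of the kind discussed here are easy to generate from FEN, the position notation that accompanies PGN. A minimal sketch handling only the piece-placement field of a FEN string; `fen_board` is a hypothetical helper for illustration, not anything from the Blue Book post-processing:

```python
def fen_board(placement):
    # Expand the piece-placement field of a FEN string into an
    # ascii diagram: '/' separates ranks, digits are runs of
    # empty squares, letters are pieces (uppercase = white).
    ranks = []
    for rank in placement.split("/"):
        row = ""
        for c in rank:
            row += "." * int(c) if c.isdigit() else c
        ranks.append(" ".join(row))
    return "\n".join(ranks)

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(fen_board(START))
```

The same expansion could feed an image generator or an SVG template instead of plain text, covering several of the options listed above from one source notation.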
We created ascii boards for the text version in proofing, and he separately generated the illustrations of the boards for the HTML. Cheers, Suzanne ======================================= = Project Gutenberg's Distributed Proofreaders = Preserving History One Page at a Time. http://www.pgdp.net ======================================= From gbnewby at pglaf.org Wed Jul 5 17:20:28 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Wed Jul 5 17:20:29 2006 Subject: [gutvol-d] Pls. test worldebookfair.com Message-ID: <20060706002028.GB18396@pglaf.org> http://www.worldebookfair.com It was on an overloaded network connection earlier, but we moved it this (Wednesday) morning and the site seems to be performing well. Take a look - it's pretty neat! There are a few missing files & broken links, but for the most part things seem OK. -- Greg From rnmscott at netspace.net.au Wed Jul 5 19:46:14 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 19:46:20 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <20060706002028.GB18396@pglaf.org> References: <20060706002028.GB18396@pglaf.org> Message-ID: <1152153974.44ac7976d0348@webmail.netspace.net.au> Still seems to be really slow, pretty much the same as yesterday (the first time I looked). I am getting < 1k download a lot of the time. Quoting Greg Newby : > http://www.worldebookfair.com > > It was on an overloaded network connection earlier, but > we moved it this (Wednesday) morning and the site seems to be > performing well. > > Take a look - it's pretty neat!
>
From brad at chenla.org Wed Jul 5 21:24:45 2006 From: brad at chenla.org (Brad Collins) Date: Wed Jul 5 21:22:06 2006 Subject: [gutvol-d] Conference Paper for Electronic Library Markup Language Message-ID: A while back I mentioned that I would be presenting a paper at the Extreme Markup Language Conference in Montreal in August. The paper is an introduction to BMF (The Burr Metadata Framework) which is a monster markup language which pulls together concepts from the FRBR and Z39.19 (NISO Standard for Monolingual Thesauri) and draws on many concepts from TEI. BMF is designed to provide a framework for building distributed electronic libraries which can be annotated and extended by anyone. A few people expressed interest in reading the paper but I lost the list. So if anyone wants to read the paper drop me a note. It's about 80K zipped. Cheers, b/ -- Brad Collins , Banqwao, Thailand From joey at joeysmith.com Wed Jul 5 21:33:35 2006 From: joey at joeysmith.com (joey) Date: Wed Jul 5 21:34:42 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <1152153974.44ac7976d0348@webmail.netspace.net.au> References: <20060706002028.GB18396@pglaf.org> <1152153974.44ac7976d0348@webmail.netspace.net.au> Message-ID: <20060706043334.GC20863@joeysmith.com> Can you check which IP address your machine resolves www.worldebookfair.com to? I'm getting 2Mb/s from 208.99.202.194 (the readingroo.ms server). Perhaps your DNS simply hasn't updated, or maybe there's congestion between you and readingroo.ms, but I'd like to know before I try adding some of the rate limiting stuff Greg has asked me to look into. On Thu, Jul 06, 2006 at 12:46:14PM +1000, rnmscott@netspace.net.au wrote: > Still seems to be really slow, pretty much the same as yesterday (the first > time I looked). I am getting < 1k download a lot of the time.
> > > Quoting Greg Newby : > > > http://www.worldebookfair.com > > > > It was on an overloaded network connection earlier, but > > we moved it this (Wednesday) morning and the site seems to be > > performing well. > > > > Take a look - it's pretty neat! > > > From rnmscott at netspace.net.au Wed Jul 5 21:40:28 2006 From: rnmscott at netspace.net.au (rnmscott@netspace.net.au) Date: Wed Jul 5 21:40:33 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <20060706043334.GC20863@joeysmith.com> References: <20060706002028.GB18396@pglaf.org> <1152153974.44ac7976d0348@webmail.netspace.net.au> <20060706043334.GC20863@joeysmith.com> Message-ID: <1152160828.44ac943c29349@webmail.netspace.net.au> PING www.worldebookfair.com (72.235.235.66) 56(84) bytes of data 64 bytes from 72.235.235.66: icmp_seq=1 ttl=110 time=3769 ms 64 bytes from 72.235.235.66: icmp_seq=2 ttl=110 time=3566 ms 64 bytes from 72.235.235.66: icmp_seq=3 ttl=110 time=3628 ms 64 bytes from 72.235.235.66: icmp_seq=4 ttl=110 time=3427 ms 64 bytes from 72.235.235.66: icmp_seq=5 ttl=110 time=3501 ms Quoting joey : > Can you check which IP address your machine resolves www.worldebookfair.com > to? I'm getting 2Mb/s from 208.99.202.194 (the readingroo.ms server). Perhaps > your DNS simply hasn't updated, or maybe there's congestion between you and > readingroo.ms, but I'd like to know before I try adding some of the rate > limiting stuff Greg has asked me to look into. > ------------------------------------------------------------ This email was sent from Netspace Webmail: http://www.netspace.net.au From joey at joeysmith.com Wed Jul 5 22:20:27 2006 From: joey at joeysmith.com (joey) Date: Wed Jul 5 22:21:32 2006 Subject: [gutvol-d] Pls. 
test worldebookfair.com In-Reply-To: <1152160828.44ac943c29349@webmail.netspace.net.au> References: <20060706002028.GB18396@pglaf.org> <1152153974.44ac7976d0348@webmail.netspace.net.au> <20060706043334.GC20863@joeysmith.com> <1152160828.44ac943c29349@webmail.netspace.net.au> Message-ID: <20060706052027.GE20863@joeysmith.com> You're still getting a connection to the old server. The new server is 208.99.202.194 On Thu, Jul 06, 2006 at 02:40:28PM +1000, rnmscott@netspace.net.au wrote: > > PING www.worldebookfair.com (72.235.235.66) 56(84) bytes of data > 64 bytes from 72.235.235.66: icmp_seq=1 ttl=110 time=3769 ms > 64 bytes from 72.235.235.66: icmp_seq=2 ttl=110 time=3566 ms > 64 bytes from 72.235.235.66: icmp_seq=3 ttl=110 time=3628 ms > 64 bytes from 72.235.235.66: icmp_seq=4 ttl=110 time=3427 ms > 64 bytes from 72.235.235.66: icmp_seq=5 ttl=110 time=3501 ms > > Quoting joey : > > > Can you check which IP address your machine resolves www.worldebookfair.com > > to? I'm getting 2Mb/s from 208.99.202.194 (the readingroo.ms server). > Perhaps > > your DNS simply hasn't updated, or maybe there's congestion between you and > > readingroo.ms, but I'd like to know before I try adding some of the rate > > limiting stuff Greg has asked me to look into. > > From gbnewby at pglaf.org Wed Jul 5 23:31:54 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Wed Jul 5 23:31:56 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <20060706052027.GE20863@joeysmith.com> References: <20060706002028.GB18396@pglaf.org> <1152153974.44ac7976d0348@webmail.netspace.net.au> <20060706043334.GC20863@joeysmith.com> <1152160828.44ac943c29349@webmail.netspace.net.au> <20060706052027.GE20863@joeysmith.com> Message-ID: <20060706063154.GB24389@pglaf.org> On Wed, Jul 05, 2006 at 11:20:27PM -0600, joey wrote: > You're still getting a connection to the old server. The new server > is 208.99.202.194 Right. 
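Joey's check earlier in the thread (which IP does your machine resolve www.worldebookfair.com to?) can be scripted. A minimal Python sketch, assuming only the two addresses quoted in this thread (208.99.202.194 for the new readingroo.ms server, 72.235.235.66 for the old one):

```python
import socket

# Addresses from this thread: the new readingroo.ms server and the old one.
NEW_SERVER = "208.99.202.194"
OLD_SERVER = "72.235.235.66"

def classify(ip: str) -> str:
    """Label a resolved address as the new server, the old one, or unknown."""
    if ip == NEW_SERVER:
        return "new server"
    if ip == OLD_SERVER:
        return "old server (stale DNS)"
    return "unknown"

def resolve(host: str = "www.worldebookfair.com") -> str:
    # Resolves through the local DNS cache, just as a browser would.
    return socket.gethostbyname(host)
```

Running `classify(resolve())` tells you whether your resolver has picked up the move; the ping output quoted above would classify as the old server.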
I changed the network TTL (the time before a cached IP address expires) from 1 day to 1 hour, so further changes will propagate faster. Depending on what network connection you're using, you might be able to force a cache reload (rebooting your system often works). During my testing earlier, I had it pushing 50Mbps. It's been averaging about 8Mbps all day. -- Greg > On Thu, Jul 06, 2006 at 02:40:28PM +1000, rnmscott@netspace.net.au wrote: > > > > PING www.worldebookfair.com (72.235.235.66) 56(84) bytes of data > > 64 bytes from 72.235.235.66: icmp_seq=1 ttl=110 time=3769 ms > > 64 bytes from 72.235.235.66: icmp_seq=2 ttl=110 time=3566 ms > > 64 bytes from 72.235.235.66: icmp_seq=3 ttl=110 time=3628 ms > > 64 bytes from 72.235.235.66: icmp_seq=4 ttl=110 time=3427 ms > > 64 bytes from 72.235.235.66: icmp_seq=5 ttl=110 time=3501 ms > > > > Quoting joey : > > > > > Can you check which IP address your machine resolves www.worldebookfair.com > > > to? I'm getting 2Mb/s from 208.99.202.194 (the readingroo.ms server). > > Perhaps > > > your DNS simply hasn't updated, or maybe there's congestion between you and > > > readingroo.ms, but I'd like to know before I try adding some of the rate > > > limiting stuff Greg has asked me to look into. > > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From sly at victoria.tc.ca Wed Jul 5 23:34:14 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Wed Jul 5 23:34:18 2006 Subject: [gutvol-d] Looking for feedback on color image files in html book. Message-ID: I have an 1875 picture book to prepare for PG, with 36 pages of color illustrations. There is not much text to go with these illustrations. (The first section is an alphabet, followed by a couple fairy-tale-type stories based on nursery rhymes.) The scans I've made are 8MB each, and when I crop them and convert to png, I can get each one down to just over a megabyte. 
However, from reading the PG guidelines and looking at a number of other example PG postings, this is still much too large for the purpose of an easily downloadable html file. So, what seems to make sense is to scale down and make some jpg images that would fit better. It would be nice though, to have somewhere to preserve the high-resolution images, even if not at PG. You can see a rough draft, with the first six images included, at: http://www.victoria.tc.ca/~sly/pb.htm Any comments would be welcome... Andrew From jon.ingram at gmail.com Thu Jul 6 00:31:03 2006 From: jon.ingram at gmail.com (Jon Ingram) Date: Thu Jul 6 00:31:05 2006 Subject: [gutvol-d] Looking for feedback on color image files in html book. In-Reply-To: References: Message-ID: <4baf53720607060031s7f5fbda2md6a23cb9abd4c18b@mail.gmail.com> On 7/6/06, Andrew Sly wrote: > > I have an 1875 picture book to prepare for PG, with 36 > pages of color illustrations. There is not much text > to go with these illustrations. (The first section is > an alphabet, followed by a couple fairy-tale-type > stories based on nursery rhymes.) > > The scans I've made are 8mb each, and when I crop them > and convert to png, I can get each one down to just over > a megabyte. However, from reading the PG guidelines and > looking at a number of other example PG postings, this > is still much too large for the purpose of an easily > downloadable html file. So, what seems to make sense > is to scale down and make some jpg images that would > fit better. > > It would be nice though, to have somewhere to preserve > the high-resolution images, even if not at PG. > > You can see a rough draft, with the first six images included, > at: http://www.victoria.tc.ca/~sly/pb.htm You can display the lower resolution versions in the main page, and link each image to a high resolution version. We do a similar thing with the illustrated magazines we put through DP. 
-- Jon Ingram From tony at baechler.net Thu Jul 6 00:53:54 2006 From: tony at baechler.net (Tony Baechler) Date: Thu Jul 6 00:53:39 2006 Subject: [gutvol-d] New DVD ISO to try In-Reply-To: <000201c69f77$af35a190$0132a8c0@blackbox> References: <20060626093237.GA27369@pglaf.org> <7.0.1.0.2.20060628084333.0426a7b0@baechler.net> <20060704084420.GA11229@pglaf.org> <000201c69f77$af35a190$0132a8c0@blackbox> Message-ID: <7.0.1.0.2.20060706004856.033c04a0@baechler.net> Hi, I'm really surprised at this comment. I admit that huge html pages take some time to load into the buffer, but I've never had a problem with them regardless of size in most cases. For older systems, I recommend Lynx for DOS or Linux. It is text-based but that shouldn't pose a problem. It has a free license so binaries could be distributed on the DVD. As far as graphical browsers, again I've never had a problem with huge html pages regardless of size and screen reader. I'm not sure that older systems will have a problem either. I'll have to actually look at the title index to be sure but I don't think a large html file should be considered a problem. I use Window-Eyes 5.5. You can contact me off list if you want since I'm not sure how your screen reader, at least nowadays, could be an issue. If it was several years ago, I would agree with the screen reader issue. At 09:39 AM 7/4/06 -0500, you wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >This looks great. One thing you could change would be to redo the title >index and distribute all of the "the" titles to their proper places in the >lists. The reason is that the t index for titles is huge. Either that or >split it. HTML files that are too large cause problems for my screen >reader, and I imagine that they might for some older systems as well, but I >could be wrong. -- No virus found in this outgoing message. Checked by AVG Anti-Virus. 
Version: 7.1.394 / Virus Database: 268.9.9/382 - Release Date: 7/4/06 From tony at baechler.net Thu Jul 6 01:03:46 2006 From: tony at baechler.net (Tony Baechler) Date: Thu Jul 6 01:03:31 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <20060706002028.GB18396@pglaf.org> References: <20060706002028.GB18396@pglaf.org> Message-ID: <7.0.1.0.2.20060706005451.033c4650@baechler.net> Yes, I noticed the slowness. It seems much better now. I have a question though. It says that you can download all ebooks. I don't care about many of them but I would like to grab at least a few hundred if not a few thousand. How? Do I really have to individually download every single pdf file by hand? I don't expect a nice ftp/rsync/http directory listing, but it would at least be nice if all the titles from a certain collection could be on one search page. If you use the search form, you only get the first 10 results. The "browse collections" page only shows a random sampling of titles from any particular collection. Also, the numbers are wrong or I'm doing something wrong. One figure shows about 250,000 pdf files, another shows 330,000 depending on whether you search or not. The Census page shows about 30,000 pdf files but the search shows about 52,000. I tried the advanced search but that seems to only be a help document unless I did something wrong. I freely admit that I'm missing something here. What am I missing? Should I be using a more specific search syntax to get what I want, i.e. all books from one collection on a page? Is there a way to show 50 results instead of 10? Also, what about the missing files? I looked at some rocketry links on the NASA collection and got error 404. Where are they? Other than security, is there any reason to not allow raw directory lists? That would make downloading much easier. With the Baen books, how do I find titles? The page only lists ISBNs and authors but no titles except for mp3 samples. 
Finally, are any of these going to eventually make it to the main PG site? Some are public domain and there is no reason why they can't be part of PG except for possible layout and pdf issues. At 05:20 PM 7/5/06 -0700, you wrote: >http://www.worldebookfair.com > >It was on an overloaded network connection earlier, but >we moved it this (Wednesday) morning and the site seems to be >performing well. > >Take a look - it's pretty neat! > >There are a few missing files & broken links, but >for the most part things seem OK. > -- Greg From cannona at fireantproductions.com Thu Jul 6 05:52:34 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Jul 6 05:52:57 2006 Subject: [gutvol-d] New DVD ISO to try References: <20060626093237.GA27369@pglaf.org><7.0.1.0.2.20060628084333.0426a7b0@baechler.net><20060704084420.GA11229@pglaf.org><000201c69f77$af35a190$0132a8c0@blackbox> <7.0.1.0.2.20060706004856.033c04a0@baechler.net> Message-ID: <000c01c6a0fb$1b23e310$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Interesting. Perhaps it's a Jaws thing, as it's done so on many systems over the years. By the way, it's the T title list. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Tony Baechler" To: "Project Gutenberg Volunteer Discussion" Sent: Thursday, July 06, 2006 2:53 AM Subject: Re: [gutvol-d] New DVD ISO to try > > > Hi, I'm really surprised at this comment. I admit that huge html pages > take some time to load into the buffer, but I've never had a problem with > them regardless of size in most cases. For older systems, I recommend > Lynx for DOS or Linux. It is text-based but that shouldn't impose a > problem. It has a free license so binaries could be distributed on the > DVD. 
As far as graphical browsers, again I've never had a problem with > huge html pages regardless of size and screen reader. I'm not sure that > older systems will have a problem either. I'll have to actually look at > the title index to be sure but I don't think a large html file should be > considered a problem. I use Window-Eyes 5.5. You can contact me off list > if you want since I'm not sure how your screen reader, at least nowadays, > could be an issue. If it was several years ago, I would agree with the > screen reader issue. > > At 09:39 AM 7/4/06 -0500, you wrote: >>-----BEGIN PGP SIGNED MESSAGE----- >>Hash: SHA1 >> >>This looks great. One thing you could change would be to redo the title >>index and distribute all of the "the" titles to their proper places in the >>lists. The reason is that the t index for titles is huge. Either that or >>split it. HTML files that are too large cause problems for my screen >>reader, and I imagine that they might for some older systems as well, but >>I >>could be wrong. > > > -- > No virus found in this outgoing message. > Checked by AVG Anti-Virus. > Version: 7.1.394 / Virus Database: 268.9.9/382 - Release Date: 7/4/06 > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFErQepI7J99hVZuJcRAtdFAKCuueilBp8JK4BdD8NolCn212tNRACgnjZR eBXfuMq+L50Q4JRfBwqwpfA= =Kc6G -----END PGP SIGNATURE----- From gbnewby at pglaf.org Thu Jul 6 09:39:24 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Thu Jul 6 09:39:25 2006 Subject: [gutvol-d] Pls. 
test worldebookfair.com In-Reply-To: <7.0.1.0.2.20060706005451.033c4650@baechler.net> References: <20060706002028.GB18396@pglaf.org> <7.0.1.0.2.20060706005451.033c4650@baechler.net> Message-ID: <20060706163924.GB1852@pglaf.org> On Thu, Jul 06, 2006 at 01:03:46AM -0700, Tony Baechler wrote: > Yes, I noticed the slowness. It seems much better now. I have a > question though. It says that you can download all ebooks. I don't > care about many of them but I would like to grab at least a few > hundred if not a few thousands. How? Do I really have to Tony, please send WEF questions directly to John, cc'd or John Guagliardo Most of the collections (though not all) do not have access except via search. > individually download every single pdf file by hand? I don't expect > a nice ftp/rsync/http directory listing, but it would at least be > nice if all the titles from a certain collection could be on one > search page. If you use the search form, you only get the first 10 > results. The "browse collections" page only shows a random sampling > of titles from any particular collection. Also, the numbers are > wrong or I'm doing something wrong. One figure shows about 250,000 > pdf files, another shows 330,000 depending on whether you search or > not. The Census page shows about 30,000 pdf files but the search > shows about 52,000. I tried the advanced search but that seems to > only be a help document unless I did something wrong. There are about 330,000 files; we talk about 250,000 (1/4 million) to take overlap into account. I don't know about the Census docs, You're right that "Advanced Search" really just gives help. > I freely admit that I'm missing something here. What am I > missing? Should I be using a more specific search syntax to get what > I want, i.e. all books from one collection on a page? Is there a way > to show 50 results instead of 10? Also, what about the missing > files? I looked at some rocketry links on the NASA collection and > got error 404. 
Where are they? Other than security, is there any There are two main sources of 404s now: 1) Some files are case-sensitive (many came from a Windoze system). We're working on this. 2) A few collections are still being loaded into the different servers. We're working on this, too. You can email John or me specific failed filenames, and I can try to locate them. That's something I can do. > reason to not allow raw directory lists? That would make downloading > much easier. With the Baen books, how do I find titles? The page > only lists ISBNs and authors but no titles except for mp3 > samples. Finally, are any of these going to eventually make it to > the main PG site? Some are public domain and there is no reason why > they can't be part of PG except for possible layout and pdf issues. Nothing will make it to the main PG site without "someone" doing the work! But most of the public domain content is already on pgcc.net , so it's not going to go away after August 4. I don't know about directory listings etc., those are questions for John. -- Greg > At 05:20 PM 7/5/06 -0700, you wrote: > >http://www.worldebookfair.com > > > >It was on an overloaded network connection earlier, but > >we moved it this (Wednesday) morning and the site seems to be > >performing well. > > > >Take a look - it's pretty neat! > > > >There are a few missing files & broken links, but > >for the most part things seem OK. > > -- Greg > > > -- > No virus found in this outgoing message. > Checked by AVG Anti-Virus. 
> Version: 7.1.394 / Virus Database: 268.9.9/382 - Release Date: 7/4/06 > From gbnewby at pglaf.org Thu Jul 6 09:44:46 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Thu Jul 6 09:44:47 2006 Subject: [gutvol-d] New DVD ISO to try In-Reply-To: <7.0.1.0.2.20060706004856.033c04a0@baechler.net> References: <20060626093237.GA27369@pglaf.org> <7.0.1.0.2.20060628084333.0426a7b0@baechler.net> <20060704084420.GA11229@pglaf.org> <000201c69f77$af35a190$0132a8c0@blackbox> <7.0.1.0.2.20060706004856.033c04a0@baechler.net> Message-ID: <20060706164446.GD1852@pglaf.org> 1) Yes, I'll have a few different indexes, so they're not so uneven in size. Also a "whole DVD" listing. 2) "The" as the first word in the title is an artifact of the back-end catalog. These basically need to be fixed by hand. (Yes, I'm sure some could be automated.... Marcello would like to hear your thoughts on this, I'm certain.) I should have the "final" version up within 24 hours. I said that yesterday, then we had a big rainstorm and (maybe coincidentally) my 'net went out. -- Greg On Thu, Jul 06, 2006 at 12:53:54AM -0700, Tony Baechler wrote: > > > Hi, I'm really surprised at this comment. I admit that huge html > pages take some time to load into the buffer, but I've never had a > problem with them regardless of size in most cases. For older > systems, I recommend Lynx for DOS or Linux. It is text-based but > that shouldn't impose a problem. It has a free license so binaries > could be distributed on the DVD. As far as graphical browsers, again > I've never had a problem with huge html pages regardless of size and > screen reader. I'm not sure that older systems will have a problem > either. I'll have to actually look at the title index to be sure but > I don't think a large html file should be considered a problem. I > use Window-Eyes 5.5. You can contact me off list if you want since > I'm not sure how your screen reader, at least nowadays, could be an > issue. 
If it was several years ago, I would agree with the screen > reader issue. > > At 09:39 AM 7/4/06 -0500, you wrote: > >-----BEGIN PGP SIGNED MESSAGE----- > >Hash: SHA1 > > > >This looks great. One thing you could change would be to redo the title > >index and distribute all of the "the" titles to their proper places in the > >lists. The reason is that the t index for titles is huge. Either that or > >split it. HTML files that are too large cause problems for my screen > >reader, and I imagine that they might for some older systems as well, but I > >could be wrong. > > > -- > No virus found in this outgoing message. > Checked by AVG Anti-Virus. > Version: 7.1.394 / Virus Database: 268.9.9/382 - Release Date: 7/4/06 > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From gbnewby at pglaf.org Thu Jul 6 09:56:20 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Thu Jul 6 09:56:22 2006 Subject: [gutvol-d] Showcase eBook #18131 Message-ID: <20060706165620.GA2836@pglaf.org> I have been perusing our collection for interesting stuff to put on the new DVD image ("interesting" being as broadly defined as possible). Here's one that is pretty new, but I didn't notice it earlier. It includes MIDI files and sheet music, as part of an eBook. It's a great example, to me, of the type of thing you can do with a computer-based eBook, but not so easily with plain old paper. The Rescue of the Princess Winsome, by Fellows-Johnston and Bacon 18131 http://www.gutenberg.org/etext/18131 (just view the HTML file, it links to the rest) Wow! -- Greg From ajhaines at shaw.ca Thu Jul 6 10:19:09 2006 From: ajhaines at shaw.ca (Al Haines (shaw)) Date: Thu Jul 6 10:21:39 2006 Subject: [gutvol-d] Showcase eBook #18131 References: <20060706165620.GA2836@pglaf.org> Message-ID: <000501c6a120$4b0df870$6401a8c0@ahainesp2400> There seems to be a problem with the book's HTML version. 
If you look at the first stage direction, just before the first speech by Ogre, the first few characters of the direction have been chopped off. It looks like this has happened with all stage directions where the first line is left indented relative to the rest of the direction (a hanging indent). This is with Internet Explorer V6. Mozilla Firefox (V 1.5) displays the stage directions properly. Al ----- Original Message ----- From: "Greg Newby" To: Sent: Thursday, July 06, 2006 9:56 AM Subject: [gutvol-d] Showcase eBook #18131 >I have been perusing our collection for interesting stuff > to put on the new DVD image ("interesting" being as broadly > defined as possible). > > Here's one that is pretty new, but I didn't notice it > earlier. It includes MIDI files and sheet music, as part > of an eBook. It's a great example, to me, of the type of > thing you can do with a computer-based eBook, but not so > easily with plain old paper. > > The Rescue of the Princess Winsome, by Fellows-Johnston and Bacon > 18131 > > http://www.gutenberg.org/etext/18131 (just view the HTML file, it > links to the rest) > > Wow! > -- Greg > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From sly at victoria.tc.ca Thu Jul 6 11:31:29 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Thu Jul 6 11:31:33 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: <20060706163924.GB1852@pglaf.org> References: <20060706002028.GB18396@pglaf.org> <7.0.1.0.2.20060706005451.033c4650@baechler.net> <20060706163924.GB1852@pglaf.org> Message-ID: As I am looking up authors etc. for the PG online catalog, and just generally browsing, I seem to be constantly running into more websites that have transcribed material that could be added to PG. With a little effort, I could make a list for you of dozens of sites, with thousands of books that could be adapted. 
However, getting copyright clearance, and reformatting these is perhaps not as "glamarous" as Distributed Proofreading, so it does not attract as many people. :) Though I already have too many different PG projects I'm in the middle of, I would be willing to help if you'd like to start processing some of these texts. Andrew On Thu, 6 Jul 2006, Greg Newby wrote: > On Thu, Jul 06, 2006 at 01:03:46AM -0700, Tony Baechler wrote: > > samples. Finally, are any of these going to eventually make it to > > the main PG site? Some are public domain and there is no reason why > > they can't be part of PG except for possible layout and pdf issues. > > Nothing will make it to the main PG site without "someone" doing > the work! But most of the public domain content is already > on pgcc.net , so it's not going to go away after August 4. > From sly at victoria.tc.ca Thu Jul 6 11:47:53 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Thu Jul 6 11:47:56 2006 Subject: [gutvol-d] New DVD ISO to try In-Reply-To: <20060706164446.GD1852@pglaf.org> References: <20060626093237.GA27369@pglaf.org> <7.0.1.0.2.20060628084333.0426a7b0@baechler.net> <20060704084420.GA11229@pglaf.org> <000201c69f77$af35a190$0132a8c0@blackbox> <7.0.1.0.2.20060706004856.033c04a0@baechler.net> <20060706164446.GD1852@pglaf.org> Message-ID: On Thu, 6 Jul 2006, Greg Newby wrote: > 2) "The" as the first word in the title is an artifact of > the back-end catalog. These basically need to be fixed > by hand. (Yes, I'm sure some could be automated.... Marcello > would like to hear your thoughts on this, I'm certain.) If you are taking your information from the PG online catalog, there should be a field for "non-filing characters" for each title, etc. which indicates how many characters to ignore for sorting purposes. This is used for initial articles. 
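Andrew's "non-filing characters" field can be applied mechanically at sort time. A minimal Python sketch (an illustration only, not the actual PG catalog code), using non-filing counts like those in his examples:

```python
# Each entry pairs a title with its count of non-filing characters,
# i.e. how many leading characters ("The ", "An ", "Das ") to skip when filing.
titles = [
    ("The Adventures of Billy", 4),
    ("Das Leben Billys", 4),
    ("An adventure with Billy", 3),
]

def filing_key(entry):
    """Sort key that drops the initial article and normalizes case."""
    title, nonfiling = entry
    return title[nonfiling:].lower()

ordered = sorted(titles, key=filing_key)
# All three now file under their first significant word:
# "adventure...", "adventures...", "leben...".
```

The same key works across languages, since the count is stored per title rather than guessed from a fixed article list.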
For example, the title "The Adventures of Billy" would be marked as having 4 non-filing characters; the title "An adventure with Billy", 3; "Das Leben Billys", 4; "A Long day with Billy", 2; "Les Amours de Billie", 4; "La Maraj Vojagxoj de Bilio", 3. Right now, as new titles are added, the common initial articles for German and English are looked after automatically. I've dealt with many of the French ones manually. Hope this helps, Andrew From Bowerbird at aol.com Thu Jul 6 12:01:20 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 6 12:01:38 2006 Subject: [gutvol-d] Pls. test worldebookfair.com Message-ID: <542.2785522.31deb800@aol.com> andrew said: > With a little effort, I could make a list for you > of dozens of sites, with thousands of books > that could be adapted. the real work, of course, is doing the "adapting". but even if p.g. doesn't want the list you'd make -- and i can't see why they would turn it down -- i'm sure plenty of people would find it useful... so go ahead! :+) i'd suggest a wiki, though, so other people could augment it. you can create a free wiki over here: > http://pbwiki.com -bowerbird p.s. that's the first time i've heard d.p. work called "glamorous"... :+) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060706/7500da31/attachment.html From Bowerbird at aol.com Thu Jul 6 12:08:23 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 6 12:08:31 2006 Subject: [gutvol-d] re: the only literature people care enough about to steal Message-ID: <549.229560e.31deb9a7@aol.com> cory doctorow said: > science fiction is the only literature people > care enough about to steal on the Internet. 
> It's the only literature that regularly shows up, > scanned and run through optical character recognition > software and lovingly hand-edited on darknet newsgroups, > Russian websites, IRC channels and elsewhere. you can find the whole article at: > http://www.locusmag.com/2006/Issues/07DoctorowCommentary.html -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060706/7eab5e86/attachment.html From cannona at fireantproductions.com Thu Jul 6 13:34:15 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Thu Jul 6 13:44:31 2006 Subject: [gutvol-d] New DVD ISO to try References: <20060626093237.GA27369@pglaf.org><7.0.1.0.2.20060628084333.0426a7b0@baechler.net><20060704084420.GA11229@pglaf.org><000201c69f77$af35a190$0132a8c0@blackbox><7.0.1.0.2.20060706004856.033c04a0@baechler.net><20060706164446.GD1852@pglaf.org> Message-ID: <007b01c6a13c$f03df5e0$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hmmm... I wonder if these are included in the RDF output, as that is what the DVD creation system uses. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Andrew Sly" To: ; "Project Gutenberg Volunteer Discussion" Sent: Thursday, July 06, 2006 1:47 PM Subject: Re: [gutvol-d] New DVD ISO to try > > > On Thu, 6 Jul 2006, Greg Newby wrote: > >> 2) "The" as the first word in the title is an artifact of >> the back-end catalog. These basically need to be fixed >> by hand. (Yes, I'm sure some could be automated.... Marcello >> would like to hear your thoughts on this, I'm certain.) > > If you are taking your information from the PG online catalog, > there should be a field for "non-filing characters" for each > title, etc. which indicates how many characters to ignore for > sorting purposes. This is used for initial articles. 
> For example the title "The Adventures of Billy" > would be marked as having 4 non-filing characters; the title > "An adventure with Billy", 3; "Das Leben Billys", 4; > "A Long day with Billy", 2; "Les Amours de Billie", 4; > "La Maraj Vojagxoj de Bilio", 3. > > Right now, as new titles added, the common > initial articles for German and English are > looked after automatically. I've dealt with > many of the French ones manually. > > Hope this helps, > Andrew > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFErXY1I7J99hVZuJcRAoXOAKDrOaVyJxlMAl1nXdGJhThVO4mGAwCfcxwF l9NMGtWuve+YIXDRJ2aBk5Q= =Z/MM -----END PGP SIGNATURE----- From prosfilaes at gmail.com Thu Jul 6 13:52:04 2006 From: prosfilaes at gmail.com (David Starner) Date: Thu Jul 6 13:52:12 2006 Subject: [gutvol-d] Pls. test worldebookfair.com In-Reply-To: References: <20060706002028.GB18396@pglaf.org> <7.0.1.0.2.20060706005451.033c4650@baechler.net> <20060706163924.GB1852@pglaf.org> Message-ID: <6d99d1fd0607061352p20f2cc31v4f8da61d69d131ff@mail.gmail.com> On 7/6/06, Andrew Sly wrote: > However, getting copyright clearance, and reformatting > these is perhaps not as "glamarous" as Distributed Proofreading, > so it does not attract as many people. :) It's not just glamarous, it's hard. You have to go through all the work of finding a specific edition that may not be well-identified in the ebook. You have to dump the text in such a way that doesn't lose all the formatting information, which may range from easy to hard, but will certainly require custom code and massaging. You have to work with a text that is unlikely to be the quality of what DP can produce after five rounds, and could turn out to be pretty bad. And it requires some tedious comparison. 
I'd actually rather rescan and reprocess and compare a lot of times rather than try to reformat existing material. If we can't get the information needed for copyright clearance from the source, or at least handle them as a group, they're pretty hard to do. From joshua at hutchinson.net Thu Jul 6 14:00:06 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jul 6 14:00:11 2006 Subject: [gutvol-d] Pls. test worldebookfair.com Message-ID: <20060706210006.CF24DEE6F6@ws6-1.us4.outblaze.com> I'm currently finishing up a raid of the English texts on reference.bahai.org (I'm scared of the Arabic and Persian originals/translations ;) ). When I'm done with that, I'd love to have a new site to raid. If you have a site or two you want to send my way, I'll look into clearances and raiding their text. Josh > ----- Original Message ----- > From: "Andrew Sly" > > As I am looking up authors etc. for the PG online catalog, > and just generally browsing, I seem to be constantly running > into more websites that have transcribed material that could > be added to PG. With a little effort, I could make a list for > you of dozens of sites, with thousands of books that could be > adapted. However, getting copyright clearance, and reformatting > these is perhaps not as "glamorous" as Distributed Proofreading, > so it does not attract as many people. :) > > Though I already have too many different PG projects I'm in > the middle of, I would be willing to help if you'd like to > start processing some of these texts. > > Andrew > From phil at thalasson.com Thu Jul 6 15:36:50 2006 From: phil at thalasson.com (Philip Baker) Date: Thu Jul 6 15:40:31 2006 Subject: [gutvol-d] Review of "The Wealth of Networks" in the TLS Message-ID: This week's Times Literary Supplement has a review of "The Wealth of Networks" by Yochai Benkler. The book mentions Project Gutenberg. Of more immediate interest is what the reviewer has to say about Project Gutenberg. The relevant part of the review is quoted below. 
The reviewer is Paul Duguid, Visiting Professor at the School of Information and Management Systems at the University of California, Berkeley. 'Given their openness, both Project Gutenberg and Wikipedia are surprisingly good and unsurprisingly bad. Some thirty years in the making, Gutenberg offers about 17,000 "etexts". Many seem unexceptional, but for some the need to avoid copyright entanglements has led contributors to resurrect editions that were better left buried. Its version of Pan, the novel by Nobel-Prizewinner Knut Hamsun, for example, puts William Wurster's ridiculously prudish translation of 1921 before unsuspecting readers. Relying on a communications medium admired for its ability to "route around censorship", yet driven by a certain contempt for scholarship, Project Gutenberg threatens to make a number of poor editions - some bowdlerized, some originally corrupt, and some newly corrupted for the new medium - the internet standard.' -- Philip Baker From Bowerbird at aol.com Thu Jul 6 16:23:51 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 6 16:23:58 2006 Subject: =?ISO-8859-1?Q?re:=20[gutvol-d]=20Review=A0=20of=20"The=20Wealth?= =?ISO-8859-1?Q?=20of=20Networks"=20in=20the=20TLS?= Message-ID: <494.4f728c8.31def587@aol.com> professor duguid ("do good"?) should digitize _his_ choice of the "best" version of that book, and donate it to project gutenberg. tell him so. then p.g. can make his choice "the internet standard". ;+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060706/6f3be59e/attachment.html From sly at victoria.tc.ca Thu Jul 6 16:41:30 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Thu Jul 6 16:41:32 2006 Subject: [gutvol-d] Adapting texts from other collections In-Reply-To: <20060706210006.CF24DEE6F6@ws6-1.us4.outblaze.com> References: <20060706210006.CF24DEE6F6@ws6-1.us4.outblaze.com> Message-ID: I'll reply to three different fellow PG'ers here... bowerbird said: >but even if p.g. doesn't want the list you'd make >-- and i can't see why they would turn it down -- >i'm sure plenty of people would find it useful... > >so go ahead! :+) > >i'd suggest a wiki, though, so other people could >augment it. My thoughts exactly! (Is it a bad sign if I'm thinking the same as bowerbird?) Actually, there is a wiki on the PG website now, as Marcello announced here not long ago, and this is one of the ideas I've had that I've put on a list of possible uses to make of it. David Starner said: >It's not just glamorous, it's hard. You have to go through all the >work of finding a specific edition that may not be well-identified in >the ebook. You have to dump the text in such a way that doesn't lose >all the formatting information, which may range from easy to hard, but >will certainly require custom code and massaging. You have to work >with a text that is unlikely to be the quality of what DP can produce >after five rounds, and could turn out to be pretty bad. And it >requires some tedious comparison. I'd actually rather rescan and >reprocess and compare a lot of times rather than try to >reformat existing material. You're right. I've gone through a process like this perhaps 25 times in reformatting a text for PG, and I think almost every time it's ended up being a bigger job than I had intended. And yet, I keep seeing more material that has had a lot of effort put into it already, and may disappear some year, if not put into PG. 
Joshua Hutchinson said: >I'm currently finishing up a raid of the English texts on reference.bahai.org >(I'm scared of the Arabic and Persian originals/translations ;) ). When I'm >done with that, I'd love to have a new site to raid. If you have a site or two >you want to send my way, I'll look into clearances and raiding their text. How about some Russian? Now that I'm able to recognize letters of the Cyrillic alphabet, I've discussed with a Russian-speaking DP volunteer the possibility of adapting some of the classic texts from lib.ru (My local university library has a surprising number of pre-1923 volumes in Russian, including complete Pushkin, Lermontov, Gogol, et al.) Andrew From Bowerbird at aol.com Thu Jul 6 16:44:37 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 6 16:44:43 2006 Subject: [gutvol-d] re: a bad sign Message-ID: <426.5b00444.31defa65@aol.com> andrew said: > (Is it a bad sign if I'm thinking the same as bowerbird?) you betcha. check yourself in to the psych ward tomorrow. ;+) -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060706/399a9f16/attachment.html From joshua at hutchinson.net Thu Jul 6 18:53:22 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Thu Jul 6 18:53:26 2006 Subject: [gutvol-d] Adapting texts from other collections Message-ID: <20060707015322.695BCDA59F@ws6-6.us4.outblaze.com> > ----- Original Message ----- > From: "Andrew Sly" > > How about some Russian? Now that I'm able to recognize > letters of the Cyrillic alphabet, I've discussed with a > Russian-speaking DP volunteer the possibility of adapting > some of the classic texts from lib.ru > (My local university library has a surprising number > of pre-1923 volumes in Russian, including complete > Pushkin, Lermontov, Gogol, et al.) > Actually, as long as I don't have to do any spell-checking, I'm good! 
:) It wasn't so much the foreign language aspects of the Arabic and Persian texts that scared me ... It was the fact that the text flows from right to left and doesn't seem to have the same paragraph-type structure that I'm used to in European languages. I imagine Chinese and Japanese texts would scare me equally! :) Josh From Bowerbird at aol.com Thu Jul 6 23:04:41 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 6 23:04:53 2006 Subject: [gutvol-d] the long tail, v2006 Message-ID: <576.310e96.31df5379@aol.com> see chris anderson's 2006 take on the long tail: > http://www.wired.com/wired/archive/14.07/longtail.html this article is just _dripping_ with juicy quotes, info, and insight. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060707/2cdb14a4/attachment.html From jon.ingram at gmail.com Fri Jul 7 00:22:57 2006 From: jon.ingram at gmail.com (Jon Ingram) Date: Fri Jul 7 00:23:00 2006 Subject: [gutvol-d] re: the only literature people care enough about to steal Message-ID: <4baf53720607070022ha442e4eof687c6fdd389258b@mail.gmail.com> On 7/6/06, Bowerbird@aol.com wrote: > cory doctorow said: > > science fiction is the only literature people > > care enough about to steal on the Internet. > > It's the only literature that regularly shows up, > > scanned and run through optical character recognition > > software and lovingly hand-edited on darknet newsgroups, > > Russian websites, IRC channels and elsewhere. > > you can find the whole article at: > > > http://www.locusmag.com/2006/Issues/07DoctorowCommentary.html An odd article, because the title really has nothing to do with the rest of the text. In a way this is good, because the title is incorrect. There's plenty of non-Science Fiction being 'stolen' on the internet, all the way from technical manuals, scans of magazines (both porn and non-porn), and non-fiction, to literary fiction. 
It's probably true that SciFi is over-represented, but that's a more subtle and interesting point. -- Jon Ingram From JBuck814366460 at aol.com Fri Jul 7 02:13:30 2006 From: JBuck814366460 at aol.com (Jared Buck) Date: Fri Jul 7 02:13:32 2006 Subject: [gutvol-d] re: the only literature people care enough about to steal In-Reply-To: <4baf53720607070022ha442e4eof687c6fdd389258b@mail.gmail.com> References: <4baf53720607070022ha442e4eof687c6fdd389258b@mail.gmail.com> Message-ID: <44AE25BA.10003@aol.com> I've spoken to Dr. Doctorow on a number of occasions online and i really like his writing. Haven't read the article yet but i plan to soon. Jared Jon Ingram wrote on 07/07/2006, 12:22 AM: > On 7/6/06, Bowerbird@aol.com wrote: > > cory doctorow said: > > > science fiction is the only literature people > > > care enough about to steal on the Internet. > > > It's the only literature that regularly shows up, > > > scanned and run through optical character recognition > > > software and lovingly hand-edited on darknet newsgroups, > > > Russian websites, IRC channels and elsewhere. > > > > you can find the whole article at: > > > > > http://www.locusmag.com/2006/Issues/07DoctorowCommentary.html > > An odd article, because the title really has nothing to do with the > rest of the text. In a way this is good, because the title is > incorrect. There's plenty of non-Science Fiction being 'stolen' on the > internet, all the way from technical manuals, scans of magazines (both > porn and non-porn), and non-fiction, to literary fiction. It's > probably true that SciFi is over-represented, but that's a more subtle > and interesting point. 
> > -- > Jon Ingram > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > -- From gbnewby at pglaf.org Fri Jul 7 03:30:44 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Jul 7 03:30:45 2006 Subject: [gutvol-d] DVD: last check Message-ID: <20060707103044.GA21769@pglaf.org> I have not confirmed this will actually fit on a DVD...that's for tomorrow. The ISO will be done in a few minutes, but you can also browse the DVD online: http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special/index.htm http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special.iso There were good suggestions, and I think I managed to act on all of them. Thanks for all your thoughts on this... -- Greg From cannona at fireantproductions.com Fri Jul 7 06:26:41 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Jul 7 06:27:06 2006 Subject: [gutvol-d] DVD: last check References: <20060707103044.GA21769@pglaf.org> Message-ID: <000301c6a1c8$fee501b0$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Since you changed from index.html to index.htm, you'll also need to edit the autorun.inf file: just change the .html there to .htm and it will work fine. Also, it should fit with no problems. I believe you even have about 350 MB left over. You might think about changing the label of the disc to something which mentions PG, if only in abbreviation. I think the original DVD was PGDVD or some such, and the CD was pg-2003-08 I believe. Perhaps something like PGDVD072006. Just a thought, but not a big deal. Finally, it might be cool if someone were to create an .ico file of the PG logo, and that could be added to the autorun.inf file to give the disc an icon under windows. Totally without purpose, except that it might look cool. 
Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Greg Newby" To: Sent: Friday, July 07, 2006 5:30 AM Subject: [gutvol-d] DVD: last check >I have not confirmed this will actually fit on a DVD...that's > for tomorrow. The ISO will be done in a few minutes, but > you can also browse the DVD online: > > http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special/index.htm > > http://snowy.arsc.alaska.edu/gbn/pgimages/jul06special.iso > > There were good suggestions, and I think I managed to act > on all of them. Thanks for all your thoughts on this... > -- Greg > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFErmExI7J99hVZuJcRAgzBAJ9QZUyV/C3Pt6T1qZjr+RufwGoQEACfbCIP 4cWfIsIrgEDoE4fCJLzxY24= =hff/ -----END PGP SIGNATURE----- From donovan at abs.net Fri Jul 7 06:54:34 2006 From: donovan at abs.net (D Garcia) Date: Fri Jul 7 06:55:07 2006 Subject: [dp-pg] Re: [gutvol-d] DVD: last check In-Reply-To: <000301c6a1c8$fee501b0$0132a8c0@blackbox> References: <20060707103044.GA21769@pglaf.org> <000301c6a1c8$fee501b0$0132a8c0@blackbox> Message-ID: <200607070954.34942.donovan@abs.net> On Friday 07 July 2006 09:26 am, Aaron Cannon wrote: > Finally, it might be cool if someone were to create an .ico file of the PG > logo, and that could be added to the autorun.inf file to give the disc an > icon under windows. Totally without purpose, except that it might look > cool. Create? The PG website icon favicon.ico (name might not be right, I have a cold and am working from memory ... the one that shows up in your browser) is an .ico file. Should be able to use it as-is. 
From cannona at fireantproductions.com Fri Jul 7 07:53:03 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Jul 7 07:54:22 2006 Subject: [dp-pg] Re: [gutvol-d] DVD: last check References: <20060707103044.GA21769@pglaf.org><000301c6a1c8$fee501b0$0132a8c0@blackbox> <200607070954.34942.donovan@abs.net> Message-ID: <000601c6a1d5$20793c40$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Right you are. It's at http://www.gutenberg.org/favicon.ico . All that needs to be done is add it to the root directory of the disc and the line: icon=favicon.ico to the autorun file and it should work. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "D Garcia" To: "Project Gutenberg Volunteer Discussion" Sent: Friday, July 07, 2006 8:54 AM Subject: Re: [dp-pg] Re: [gutvol-d] DVD: last check > On Friday 07 July 2006 09:26 am, Aaron Cannon wrote: >> Finally, it might be cool if someone were to create an .ico file of the >> PG >> logo, and that could be added to the autorun.inf file to give the disc an >> icon under windows. Totally without purpose, except that it might look >> cool. > > Create? > The PG website icon favicon.ico (name might not be right, I have a cold > and am > working from memory ... the one that shows up in your browser) is an .ico > file. Should be able to use it as-is. > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. 
iD8DBQFErnWmI7J99hVZuJcRAtSLAKCfVftMzGPlF9D4gb7IV6Rll0gBxgCg28gV ebFKoC1E138JT9qUniSK/vU= =TuGc -----END PGP SIGNATURE----- From Bowerbird at aol.com Fri Jul 7 12:10:54 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 7 12:11:00 2006 Subject: [gutvol-d] DVD: last check Message-ID: <53f.2b98e6e.31e00bbe@aol.com> greg- here is a slight reworking of that info-page you made on the d.v.d. search for question-marks to find questions i had... -bowerbird =========================================================== "july 2006 special" -- current as of ebook #18739 =========================================================== baseline: everything but the hgp: 1-2199 (then skip 2200-2224) 2225-3500 (then skip 3501-3524) 3525-11774 (then skip 11775-11799) 11800-20000 =========================================================== particular items of interest included that are not text and not html: 116 (zip/avi, select "all") -- apollo 11 moon landing movie 156 (midi, select "all") -- beethoven's 5th symphony audio 249 (zip/html) -- french cave-paintings pictures 256 (zip/mpg) -- rotating-earth movie 3002 (mp3) -- janis ian, society's child audio 5212-5216 ("all") -- a-bomb videos (was "5212-5215"? -- is 5216 a compilation?) 9551 ("all") -- human-read sherlock holmes audio 10177 ("all") -- ride of the valkyries, audio 17246 ("all") , but it doesn't include all the mp3s -- wrong e-text #? =========================================================== selected top-100 titles to specify as html: 11 -- alice in wonderland (does this mean you used #928?) 132 -- art of war (does this mean you used #17405?) 5000 (da vinci notebooks, html?, complete set of #4998 and #4999?) 5001 -- einstein's relativity 5200 -- "metamorphosis" (anything special about this html file?) 
8710 -- dore bible illustrations 8800 -- dante's divine comedy (only available as html download) 9551 -- human-read sherlock holmes 10681 -- roget's thesaurus (heavily formatted with styles) 13510 -- "knots, splices and rope work" =========================================================== a few extras to specify as html: 10600 -- kerr's "voyages and travels" (but no images in this file?) illustrated beatrix potter: 17089, 15575, 15284, 15234, 15137, 15077, 14877, 14872, 14868, 14848, 14838, 14837, 14814, 14797, 14407, 14304, 14220, 12103 the first 20 punch: -- (all of the punch are listed separately below) 18114, 17994, 17654, 17653, 17634, 17629, 17596, 17471, 17397, 17216, 16877, 16727, 16717, 16707, 16684, 16673, 16640, 16628, 16619, 16592, the sciam (232mb) -- (listed below) -- (so, were all these included on the dvd?) =========================================================== eliminate some titles that are part of series. these "complete" volumes were skipped, and their individual volumes were retained.) 
to skip (a total of 245 duplicate "completes"): 17216, 16205, 16190, 16146, 13260, 13042, 12242, 12215, 12161, 11996, 11976, 10876, 9774, 9761, 9755, 9670, 9600, 9450, 9320, 9170, 9169, 8800, 8726, 8710, 8562, 8525, 8516, 8505, 8460, 8100, 7878, 7852, 7761, 7756, 7749, 7735, 7727, 7714, 7701, 7691, 7684, 7671, 7658, 7649, 7639, 7630, 7623, 7614, 7608, 7605, 7535, 7420, 7400, 7332, 7317, 7290, 7140, 7025, 7005, 6944, 6941, 6780, 6775, 6761, 6615, 6516, 6478, 6400, 6300, 6299, 6295, 6291, 6288, 6284, 6280, 6274, 6271, 6267, 6260, 6253, 6249, 6241, 6236, 6229, 6222, 6217, 6214, 6210, 6205, 6201, 6194, 6191, 6179, 6156, 6098, 5999, 5998, 5946, 5921, 5668, 5650, 5600, 5587, 5583, 5577, 5571, 5560, 5551, 5542, 5529, 5516, 5507, 5499, 5493, 5482, 5472, 5466, 5460, 5449, 5416, 5400, 5396, 5387, 5382, 5373, 5364, 5355, 5300, 5240, 5225, 5060, 5059, 5058, 5057, 5056, 5055, 5000, 4973, 4912, 4900, 4899, 4885, 4884, 4872, 4860, 4847, 4836, 4800, 4645, 4546, 4500, 4491, 4488, 4482, 4476, 4470, 4464, 4460, 4452, 4443, 4434, 4426, 4420, 4412, 4405, 4397, 4367, 4362, 4361, 4330, 4270, 4269, 4264, 4261, 4200, 4199, 4195, 4184, 4171, 4162, 4153, 4145, 4138, 4131, 4125, 4116, 4107, 3999, 3995, 3990, 3985, 3980, 3975, 3971, 3967, 3962, 3957, 3953, 3946, 3942, 3938, 3934, 3930, 3926, 3922, 3918, 3913, 3899, 3883, 3859, 3854, 3846, 3841, 3766, 3739, 3684, 3649, 3600, 3580, 3567, 3545, 3534, 3374, 3350, 3254, 3253, 3252, 3199, 3189, 3178, 3177, 3176, 3125, 3090, 3072, 2988, 2895, 2760, 2270, 2144, 1837, 100, 86, 76, 74 =========================================================== these individual volumes skipped, their "complete" version retained. (do i have that right?) 11801-11856 8301-8373 8228-8293 8001-8066 6419-6420 6348-6349 6161 5010-5049 1609-1610 1581-1582 =========================================================== hgp items to skip (the reverse of the first list above): 11775-11799 3501-3524 -- (the original "4501-3524" was a mistake?) 
2200-2224 =========================================================== here are all the punch/punchinello issues (about 660mb): 18114, 17994, 17654, 17653, 17634, 17629, 17596, 17471, 17397, 17216, 16877, 16727, 16717, 16707, 16684, 16673, 16640, 16628, 16619, 16592, 16563, 16509, 16401, 16394, 16364, 16281, 16271, 16263, 16213, 16152, 16113, 16107, 15973, 15957, 15912, 15742, 15688, 15677, 15657, 15615, 15605, 15594, 15512, 15453, 15442, 15441, 15439, 15377, 15366, 15332, 15330, 15196, 15166, 15144, 15142, 15121, 15064, 15049, 15026, 15021, 15012, 14991, 14974, 14973, 14966, 14965, 14942, 14941, 14940, 14939, 14938, 14937, 14936, 14935, 14934, 14933, 14932, 14931, 14930, 14929, 14928, 14927, 14926, 14925, 14924, 14923, 14922, 14921, 14920, 14919, 14856, 14846, 14845, 14808, 14787, 14769, 14767, 14747, 14745, 14707, 14695, 14694, 14690, 14652, 14639, 14601, 14592, 14544, 14516, 14514, 14483, 14455, 14452, 14450, 14390, 14389, 14365, 14364, 14344, 14341, 14321, 14277, 14272, 14250, 14231, 14229, 14217, 14199, 14186, 14166, 14165, 14146, 14141, 14135, 14123, 14122, 14093, 14074, 14067, 14057, 14053, 14046, 13995, 13994, 13966, 13961, 13954, 13927, 13903, 13710, 13639, 13563, 13538, 13503, 13502, 13491, 13466, 13465, 13446, 13422, 13421, 13391, 13390, 13373, 13352, 13348, 13327, 13323, 13313, 13297, 13283, 13281, 13270, 13269, 13253, 13252, 13244, 13186, 13185, 13098, 13074, 13067, 12951, 12944, 12934, 12917, 12905, 12872, 12866, 12860, 12825, 12739, 12738, 12737, 12614, 12536, 12517, 12469, 12468, 12467, 12466, 12465, 12395, 12394, 12393, 12392, 12378, 12323, 12306, 12305, 12294, 12292, 12262, 12232, 12231, 12114, 12079, 12043, 11963, 11919, 11910, 11908, 11907, 11872, 11868, 11732, 11726, 11712, 11704, 11670, 11638, 11630, 11629, 11619, 11617, 11571, 11570, 11491, 11466, 11444, 11443, 11429, 11428, 11425, 11359, 11284, 11225, 11201, 11177, 11169, 11133, 11109, 11094, 11076, 10964, 10952, 10934, 10933, 10923, 10903, 10721, 10711, 10663, 10614, 10595, 10594, 10544, 
10450, 10292, 10144, 10143, 10106, 10105, 10104, 10092, 10091, 10047, 10036, 10035, 10034, 10033, 10032, 10019, 10018, 10017, 10016, 10015, 10014, 10013, 9962, 9961, 9960, 9953, 9898, 9885, 9877, 9819, 9797, 9658, 9636, 9549, 9545, 9544, 9481, 8643, 8433 =========================================================== all the scientific american supplements (about 235mb) 18345, 18265, 17817, 17755, 17167, 16972, 16948, 16792, 16773, 16671, 16360, 16354, 16353, 16270, 15889, 15833, 15831, 15708, 15417, 15193, 15052, 15051, 15050, 14990, 14989, 14097, 14041, 14009, 13962, 13939, 13640, 13443, 13401, 13399, 13358, 12490, 11761, 11736, 11735, 11734, 11662, 11649, 11648, 11647, 11498, 11385, 11383, 11344, 9666, 9266, 9163, 9076, 8952, 8951, 8950, 8862, 8742, 8718, 8717, 8687, 8559, 8504, 8484, 8483, 8452, 8408, 8391, 8297, 8296, 8195 =========================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060707/92e9af28/attachment.html From gbnewby at pglaf.org Fri Jul 7 12:12:10 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Jul 7 12:12:13 2006 Subject: [dp-pg] Re: [gutvol-d] DVD: last check In-Reply-To: <000601c6a1d5$20793c40$0132a8c0@blackbox> References: <200607070954.34942.donovan@abs.net> <000601c6a1d5$20793c40$0132a8c0@blackbox> Message-ID: <20060707191210.GC2905@pglaf.org> On Fri, Jul 07, 2006 at 09:53:03AM -0500, Aaron Cannon wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Right you are. It's at http://www.gutenberg.org/favicon.ico . All that > needs to be done is add it to the root directory of the disc and the line: > icon=favicon.ico > to the autorun file and it should work. > > Sincerely > Aaron Cannon > Got it: [autorun] open=rundll32.exe url.dll,FileProtocolHandler index.htm icon=favicon.ico > - -- > Skype: cannona > MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail > address.) 
> - ----- Original Message ----- > From: "D Garcia" > To: "Project Gutenberg Volunteer Discussion" > Sent: Friday, July 07, 2006 8:54 AM > Subject: Re: [dp-pg] Re: [gutvol-d] DVD: last check > > > >On Friday 07 July 2006 09:26 am, Aaron Cannon wrote: > >>Finally, it might be cool if someone were to create an .ico file of the > >>PG > >>logo, and that could be added to the autorun.inf file to give the disc an > >>icon under windows. Totally without purpose, except that it might look > >>cool. > > > >Create? > >The PG website icon favicon.ico (name might not be right, I have a cold > >and am > >working from memory ... the one that shows up in your browser) is an .ico > >file. Should be able to use it as-is. > >_______________________________________________ > >gutvol-d mailing list > >gutvol-d@lists.pglaf.org > >http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 > Comment: Key available from all major key servers. 
> > iD8DBQFErnWmI7J99hVZuJcRAtSLAKCfVftMzGPlF9D4gb7IV6Rll0gBxgCg28gV > ebFKoC1E138JT9qUniSK/vU= > =TuGc > -----END PGP SIGNATURE----- > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From gbnewby at pglaf.org Fri Jul 7 13:54:34 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Jul 7 13:54:35 2006 Subject: [gutvol-d] Re: DVD: last check In-Reply-To: <53f.2b98e6e.31e00bbe@aol.com> References: <53f.2b98e6e.31e00bbe@aol.com> Message-ID: <20060707205434.GA5067@pglaf.org> More fixes applied....added the .ico, added a few inspirational quotes from our content, -- Greg From cannona at fireantproductions.com Fri Jul 7 14:18:26 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Fri Jul 7 14:30:39 2006 Subject: [gutvol-d] Re: DVD: last check References: <53f.2b98e6e.31e00bbe@aol.com> <20060707205434.GA5067@pglaf.org> Message-ID: <000b01c6a20c$96ba8030$0132a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Looks like the dvd was also moved. http://snowy.arsc.alaska.edu/gbn/pgimages/pgdvd072006/ is the online version. Just say the word and I'll get it up on the torrent tracker. Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) - ----- Original Message ----- From: "Greg Newby" To: Sent: Friday, July 07, 2006 3:54 PM Subject: [gutvol-d] Re: DVD: last check > More fixes applied....added the .ico, > added a few inspirational quotes from our content, > > -- Greg > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. 
iD8DBQFErtKMI7J99hVZuJcRAvd4AKCtpw0k7lSJteGX45OzO04xUxCYIACgnB1v uvQIQv1tCi04A+BluD2rXk0= =Fcwy -----END PGP SIGNATURE----- From gbnewby at pglaf.org Fri Jul 7 16:05:13 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Fri Jul 7 16:05:15 2006 Subject: [gutvol-d] Re: DVD: last check In-Reply-To: <000b01c6a20c$96ba8030$0132a8c0@blackbox> References: <53f.2b98e6e.31e00bbe@aol.com> <20060707205434.GA5067@pglaf.org> <000b01c6a20c$96ba8030$0132a8c0@blackbox> Message-ID: <20060707230513.GA8091@pglaf.org> On Fri, Jul 07, 2006 at 04:18:26PM -0500, Aaron Cannon wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Looks like the dvd was also moved. > http://snowy.arsc.alaska.edu/gbn/pgimages/pgdvd072006/ is the online > version. Right. The ISO: http://snowy.arsc.alaska.edu/gbn/pgimages/pgdvd072006.iso The checksum: http://snowy.arsc.alaska.edu/gbn/pgimages/pgdvd072006.md5 > Just say the word and I'll get it up on the torrent tracker. Go for it! Though I might relocate it at some point. I made a physical DVD, it's just fine. I don't know why it's not as full as I thought...the automated tool must do some poor math somewhere. The main anomaly, which I won't try to fix right now, is that we have many cases where there is a -8.zip, a -0.zip and a .zip for different character sets but the same eBook. I also found a title that just didn't make it, and assume there are more. So, we might get these fixed, but the DVD is "good enough" to make some copies, and to put up for people to experiment with. -- Greg > Sincerely > Aaron Cannon > > > - -- > Skype: cannona > MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail > address.) 
> - ----- Original Message ----- > From: "Greg Newby" > To: > Sent: Friday, July 07, 2006 3:54 PM > Subject: [gutvol-d] Re: DVD: last check > > > >More fixes applied....added the .ico, > >added a few inspirational quotes from our content, > > > > -- Greg > >_______________________________________________ > >gutvol-d mailing list > >gutvol-d@lists.pglaf.org > >http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > > > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFErtKMI7J99hVZuJcRAvd4AKCtpw0k7lSJteGX45OzO04xUxCYIACgnB1v uvQIQv1tCi04A+BluD2rXk0= =Fcwy -----END PGP SIGNATURE----- From urbangleaner56 at yahoo.com Sat Jul 8 17:48:35 2006 From: urbangleaner56 at yahoo.com (Jacqulyn Perry) Date: Sat Jul 8 17:55:18 2006 Subject: [gutvol-d] Comments & Questions About Book Illustrations Message-ID: <20060709004835.98347.qmail@web38510.mail.mud.yahoo.com> Hi; Obviously, I'm new here. I realize that the Project is mainly text driven, but... I downloaded the Hans Christian Andersen Fairy Tales edition with Edmund Dulac's illustrations (after jumping around the room in joyful glee) and then realized that the dpi on the images isn't high enough to print them out, with the rest of the book. At least, not very well. Then, the images, or scans of them, were so dark that you can't even really see them. I'm assuming that that's mainly due to the age of the book. Anyway, I took the liberty of working with the illos a little... I lightened them a little, so they were viewable, and placed them in a folder in my computer, leaving the original image file in the ebook folder alone. I'm more than happy to send these lightened images to someone, to see if they want to replace the existing image file in the ebook with THIS file. They aren't any higher dpi of course, but at least people will be able to see them. But if there is an interest, I need to know who as well as how to send them. 
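The lightening pass Leigh describes is, at bottom, a brightness (gamma) curve applied to every pixel. A dependency-free sketch on raw 8-bit grayscale values; the gamma value here is an arbitrary example, and a real workflow would apply the same curve through an imaging tool rather than on bare lists:

```python
def lighten(pixels, gamma=2.2):
    """Apply a gamma curve to 8-bit pixel values; gamma > 1 lifts the
    midtones while leaving pure black (0) and pure white (255) fixed."""
    return [round(255 * (v / 255) ** (1 / gamma)) for v in pixels]

# Black and white stay where they are; everything in between moves
# toward white, which is what makes a too-dark scan viewable again.
row = [0, 64, 128, 192, 255]
lightened = lighten(row)
```

Because the endpoints are fixed, the adjustment brightens without clipping, unlike simply adding a constant to every pixel.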
My question is this... why aren't the images higher res? I realize it would take up more space, but it would be wonderful to be able to have good quality illos to go along with the text. But then, besides being a reader, I'm a visual artist myself. But it would be wonderful, especially given the fact that so much of this artwork is completely inaccessible to the public, or if available in poster/print form, it's anywhere from $45-$80 or more for a small reproduction, and this is work that's in the public domain! Anyway, sorry about the rant. Leigh --------------------------------- Yahoo! Messenger with Voice. Make PC-to-Phone Calls to the US (and 30+ countries) for 2¢/min or less. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060708/714c91d3/attachment.html From grythumn at gmail.com Sat Jul 8 18:44:07 2006 From: grythumn at gmail.com (Robert Cicconetti) Date: Sat Jul 8 18:45:42 2006 Subject: [gutvol-d] Comments & Questions About Book Illustration In-Reply-To: <20060709004835.98347.qmail@web38510.mail.mud.yahoo.com> References: <20060709004835.98347.qmail@web38510.mail.mud.yahoo.com> Message-ID: <15cfa2a50607081844i5b8dd42fkad50d9253bc55455@mail.gmail.com> Current policy on DP books[1] is to have high-resolution scans of the illustrations archived along with the projects... they will eventually be made available for people to use. Keep in mind raw scans are extremely expensive in terms of disk space/bandwidth... a mildly graphics-heavy book can easily hit 200-300 megs, while a book with a lot of color plates can easily exceed several gigabytes. (Before descreening, etc. I scan all illos at 600 DPI as PNGs; after descreening and downscaling to 300 DPI the size drops dramatically.) R C [1] Not sure when it went into effect. Probably sometime after the Project Manager/Post Processor jobs were split. On 7/8/06, Jacqulyn Perry wrote: > > Hi; > Obviously, I'm new here. 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060708/3d165f19/attachment.html

From urbangleaner56 at yahoo.com Sat Jul 8 19:51:29 2006
From: urbangleaner56 at yahoo.com (Jacqulyn Perry)
Date: Sat Jul 8 19:51:32 2006
Subject: [gutvol-d] Comments & Questions About Book Illustration
In-Reply-To: <15cfa2a50607081844i5b8dd42fkad50d9253bc55455@mail.gmail.com>
Message-ID: <20060709025129.40318.qmail@web38507.mail.mud.yahoo.com>

Okay, that makes sense... would the powers that be want to take a look-see at the file of 'lightened' images?

Leigh

Robert Cicconetti wrote: [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060708/03bf4f62/attachment.html

From sly at victoria.tc.ca Sat Jul 8 23:42:56 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sat Jul 8 23:42:59 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <20060709004835.98347.qmail@web38510.mail.mud.yahoo.com>
References: <20060709004835.98347.qmail@web38510.mail.mud.yahoo.com>

Hi Leigh. Thanks for sharing your ideas.
I suppose an answer to your question is that, as you mention, PG is "mainly text driven". For the general user, images need to be accessible for download over a slower connection. Also, we need to remember that, in these cases, how the images are prepared is up to each individual volunteer. I find dealing with images a pain, because unlike the relatively straightforward text, there are so many variables in digitizing images.

I'm now leaning towards transcribing this picture book using reduced-size jpgs (averaging 375 by 530 pixels) that will hopefully make for a smoothly loading html file for most users. Then I'll include zipped hi-res page images for the occasional person who might want them.

Andrew

On Sat, 8 Jul 2006, Jacqulyn Perry wrote:
> [...]
From urbangleaner56 at yahoo.com Sun Jul 9 02:37:16 2006
From: urbangleaner56 at yahoo.com (Jacqulyn Perry)
Date: Sun Jul 9 02:37:20 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To:
Message-ID: <20060709093716.13839.qmail@web38509.mail.mud.yahoo.com>

Hi Andrew!

That sounds like a great idea, and a sensible compromise! Is there a way I could help with this? I'm not sure HOW I could help; I don't have access to the books, and I seriously doubt I could get my hands on a copy of any of them. Most are collectors' items and way outside my budget.

In the meantime, I still have the re-worked images I mentioned, which I would very much like to pass on to you or whomever. They are the same ones in the original file, just re-worked a little so you can actually see them, rather than a dark blur.

Leigh

Andrew Sly wrote: [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/1fba70bc/attachment-0001.html

From cannona at fireantproductions.com Sun Jul 9 05:07:48 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Jul 9 05:08:09 2006
Subject: [gutvol-d] new dvd image
Message-ID: <000301c6a350$508ac260$0300a8c0@blackbox>

Hello all.

The new dvd image is available as a torrent: http://snowy.arsc.alaska.edu:6969 . My email is once again doing strange things; I seem to be able to send but not receive. However, it doesn't seem to be bouncing, so if you sent me something, hopefully I'll get it eventually, but perhaps not.

Sincerely
Aaron Cannon

- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.)

From sly at victoria.tc.ca Sun Jul 9 09:20:45 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sun Jul 9 09:20:48 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <20060709093716.13839.qmail@web38509.mail.mud.yahoo.com>
References: <20060709093716.13839.qmail@web38509.mail.mud.yahoo.com>

Leigh: How accomplished are you at editing images? If you'd like to help with this particular item I have underway, one image could use a small clean-up where there seems to be a smudge of blue ink. Take a look at http://www.victoria.tc.ca/~sly/pb.htm (see the first "alphabet" image, on the baby's forehead). This is beyond what I feel comfortable dealing with. If you are interested, I have a 1.4 MB png file that I would request to have edited and returned in the same format.

Or were you asking from a more general point of view?
By far the easiest way for a new volunteer to contribute to Project Gutenberg is to sign up at Distributed Proofreaders (pgdp.net) and help with one page at a time.

As for your re-worked images, it would be best to ask one of the white-washers (the people who actually post files), perhaps via errata [at] pglaf.org. In that case, please include fuller details, such as the PG number of the ebook. I can predict that a likely response would be that lightened images are welcome if you can make them from an original source.

In the case at hand, the files included are jpgs. Jpgs use lossy compression, which means that each time you save one, some information is lost. So if you take a jpg, edit it, and save it again, you will probably end up with a file that is both larger in size and poorer in quality than what you started with. If you really want to persevere with this text, you could try to contact whoever prepared it and ask if high-resolution images are available for you to work from. (And all of this provides a good argument for why I'd like to preserve hi-res images somewhere for the text I'm working on.)

Andrew

On Sun, 9 Jul 2006, Jacqulyn Perry wrote:
> [...]
> > Leigh

From Bowerbird at aol.com Sun Jul 9 09:31:49 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Jul 9 09:31:58 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
Message-ID: <3f4.63db38f.31e28975@aol.com>

andrew said:
> By far the easiest way for a new volunteer to contribute to
> Project Gutenberg is to sign up at Distributed Proofreaders
> (pgdp.net) and help with one page at a time.

that might be the "easiest" way for a new volunteer to contribute, but it won't make the best use of the skill-set leigh has to offer... instead, leigh, trot on over to distributed proofreaders and find, in the _forums_ there, the group of people who are focused on handling images, and let them know you'd like to go to work...

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/e904cf14/attachment.html

From urbangleaner56 at yahoo.com Sun Jul 9 13:03:05 2006
From: urbangleaner56 at yahoo.com (Jacqulyn Perry)
Date: Sun Jul 9 13:03:11 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To:
Message-ID: <20060709200305.43554.qmail@web38505.mail.mud.yahoo.com>

My real interest is in working with the images. My problem is that I'm not very computer literate (I'm an 'old-fashioned' painter), so I'm not sure how much I can do with the limited graphics programs I have.

I'm pretty sure I can take care of the smudge, but it would require me to use Paint to remove it, then print the image out and do the retouch by hand (I used to work as a photo retoucher), then scan the retouched image to send back to you. Which I would be glad to do.

I've posted the image at an artist website I belong to, which has a VERY active computer graphics forum, and asked their advice. I've also asked about a good graphics program.
Though from what I've seen so far, most of the images you folks have at most just require a little brightening and maybe a tiny bit of color adjustment. That I CAN do with what I have. Anything I CAN'T do, I will say so.

I'm sure that Adobe Photoshop would take care of anything at all I would need to do, but due to lack of cash, buying it is out of the question for now.

Oh yes, I figured I would need to contact the person who originally did the book and ask for a high-res file of the images. I just wanted someone to see what a difference lightening them makes.

Leigh

Andrew Sly wrote: [...]
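As an aside on Andrew's earlier point about lossy jpgs: the loss starts even before JPEG's quantization step, because the codec converts RGB to YCbCr and rounds to integers along the way, so a decode/re-encode cycle is not guaranteed to return the original pixels. The sketch below is a toy model in plain Python (no imaging library); it uses the standard JFIF color-conversion constants but deliberately ignores the DCT quantization and chroma subsampling that cause most real-world damage, just to show that even the rounding alone is lossy:

```python
def clamp(x):
    """Clamp a channel value to the valid 0-255 range."""
    return max(0, min(255, x))

def rgb_to_ycbcr(r, g, b):
    """JFIF RGB -> YCbCr conversion, rounded to integers as a codec must."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(clamp(int(v + 0.5)) for v in (y, cb, cr))

def ycbcr_to_rgb(y, cb, cr):
    """Inverse JFIF conversion, again rounded to integers."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(clamp(int(v + 0.5)) for v in (r, g, b))

def roundtrip(rgb):
    """One encode/decode cycle of the color conversion alone."""
    return ycbcr_to_rgb(*rgb_to_ycbcr(*rgb))

# The pixel (0, 0, 1) comes back as (0, 0, 2): information is lost
# before the quantization that does most real JPEG damage even starts.
print(roundtrip((0, 0, 1)))
```

A real re-save compounds this with fresh quantization error on every cycle, which is why editing a jpg and saving it again tends to produce the larger-but-worse file Andrew describes.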
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/38b5c28d/attachment.html

From cannona at fireantproductions.com Sun Jul 9 14:46:57 2006
From: cannona at fireantproductions.com (Aaron Cannon)
Date: Sun Jul 9 14:54:06 2006
Subject: [gutvol-d] PG Wiki
Message-ID: <000001c6a3a2$31797270$0300a8c0@blackbox>

Hi Marcello and all.

Any word on when the static content on Gutenberg.org will be done away with and replaced by the wiki?

Also, congratulations Italy! That was an intense game. :)

Sincerely
Aaron Cannon

- --
Skype: cannona
MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.)
From vze3rknp at verizon.net Sun Jul 9 15:01:34 2006
From: vze3rknp at verizon.net (Juliet Sutherland)
Date: Sun Jul 9 15:01:36 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <20060709200305.43554.qmail@web38505.mail.mud.yahoo.com>
References: <20060709200305.43554.qmail@web38505.mail.mud.yahoo.com>
Message-ID: <44B17CBE.6020907@verizon.net>

Hi Jacqulyn,

You might have a look at The GIMP, which does almost everything Photoshop does and is free. There is an Illustrators team at DP that always needs help. I hope that you, and perhaps some of the folks you are in contact with, will join and give us a hand with our illustrations. I hope that eventually DP will have a parallel process in which experts prepare illustrations while the text is being proofed and formatted.

Whether for DP or otherwise, there are several common steps in prepping illustrations for a PG book. First, the originals have to be scanned. Getting good scans of illustrations takes practice, and not all of our volunteer content providers are good at it. But everyone does the best they can, and we encourage them all to scan illustrations at a decent resolution and, in the case of DP, upload those scans to our server.

Another stage of the process is taking the raw scans and making them as good as possible, while still leaving them large for archiving. I usually do this before I upload to the DP server (stuff like deskewing, making sure the colors match as well as possible, etc.). But not all volunteers have learned enough about graphics programs to do that part.
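The reduction stage of this pipeline, and the 600 DPI to 300 DPI downscaling Robert mentioned earlier, come down to resampling pixels. A toy sketch of the simplest resampler, a 2x box filter that averages each 2x2 block of pixels into one; plain Python on a grayscale image held as a list of rows, with no imaging library assumed:

```python
def box_downscale_2x(pixels):
    """Halve a grayscale image (a list of rows of 0-255 ints) by
    averaging each 2x2 block into one pixel: a simple box filter.
    Width and height are assumed to be even."""
    out = []
    for y in range(0, len(pixels), 2):
        row = []
        for x in range(0, len(pixels[y]), 2):
            block = (pixels[y][x] + pixels[y][x + 1] +
                     pixels[y + 1][x] + pixels[y + 1][x + 1])
            row.append(block // 4)  # integer average of the four pixels
        out.append(row)
    return out

# A 4x4 "scan" becomes a 2x2 thumbnail with one quarter the pixels.
scan = [
    [0,   0,   255, 255],
    [0,   0,   255, 255],
    [10,  10,  20,  20],
    [10,  10,  20,  20],
]
print(box_downscale_2x(scan))  # [[0, 255], [10, 20]]
```

Real tools filter more carefully before decimating to avoid moire patterns (the descreening mentioned in this thread), but block averaging is the core of why a 300 DPI version is a quarter the pixel count of a 600 DPI scan.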
Then further, PG usually wants illustrations that will look good on a screen, and wants to keep the overall file sizes down, so there is another stage of processing that reduces the image as much as possible without unacceptable loss of detail. There are definitely tricks to doing that (which I don't know). Often folks will choose to make a smaller version for display within the ebook and a larger one that can be reached by clicking on the picture. Also, what's considered "reasonable" for size and detail depends to some extent on the book. A children's picture book, or a book about art, can reasonably have larger illustrations than something that started from not-so-good B&W photographs poorly printed.

We deal with everything from simple line art, to steel-cut engravings (very fine detail), to printed color illustrations (needing descreening), to the various -gravure processes that seem to scan beautifully (I don't know what those processes are, but they don't seem to produce the same kind of screen dots one sees in most color or B&W photo material), to beat-up decorative book covers. There are also illustrations and maps that are too large to be scanned in one piece and need to be put back together. Lots of challenges for people who like to do restoration.

JulietS
DP Site Admin

Jacqulyn Perry wrote:
> [...]
From urbangleaner56 at yahoo.com Sun Jul 9 20:32:07 2006
From: urbangleaner56 at yahoo.com (Jacqulyn Perry)
Date: Sun Jul 9 20:32:11 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <53a.2df95ce.31e2deeb@aol.com>
Message-ID: <20060710033207.33340.qmail@web38506.mail.mud.yahoo.com>

Thanks! These are only a few of the re-worked images I have. I still have some left to do, and after that I need to put all of the re-worked ones in a file to send off.

Yeah, that's one of the things I'm concerned about, because I don't have the actual book, so I can't compare against the printed images. So as long as someone who DOES have access to the books can check my work, and as long as I'm VERY conservative about what I do to the images, and how much, then we should be golden.

A couple of other things... I'm teaching this week at our local art center (kids' summer arts program) and won't have much time this coming week, on top of my own projects and job hunting. The other thing is that I need to educate myself about computer imaging programs, etc., because until I did a web search, I had no idea WHAT PNG is, or that computers are capable of 'true color'. I need to talk to my son-in-law and ask him how to find out what my computer and monitor are capable of. So obviously I have some homework ahead of me.
Leigh

Bowerbird@aol.com wrote:

leigh-

> It WORKED!!!

yes it did, and they look nice. (except it seems you skipped over plate05.) i'll compare them to the original, but so far i'd say you did a good job.

-bowerbird

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/d6ddac84/attachment.html

From sly at victoria.tc.ca Sun Jul 9 20:40:52 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Sun Jul 9 20:40:54 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <20060710033207.33340.qmail@web38506.mail.mud.yahoo.com>
References: <20060710033207.33340.qmail@web38506.mail.mud.yahoo.com>

That's a good point. Through being involved with Project Gutenberg, I've learned many different things in more areas than I would have imagined.

Andrew

On Sun, 9 Jul 2006, Jacqulyn Perry wrote:
> [...]

From urbangleaner56 at yahoo.com Sun Jul 9 20:47:02 2006
From: urbangleaner56 at yahoo.com (Jacqulyn Perry)
Date: Sun Jul 9 20:47:05 2006
Subject: [gutvol-d] Comments & Questions About Book Illustrations
In-Reply-To: <44B17CBE.6020907@verizon.net>
Message-ID: <20060710034702.36120.qmail@web38510.mail.mud.yahoo.com>

Oh good! Does the GIMP software have a Help file? I hope? Okay, what is DP? Is it part of PG? I'm really interested in working with children's book illustrations, as that is what I have some background in as an artist.
After I've learned more about working with computer graphics programs, I would probably be willing to help out with the other types of imaging work as well.

Leigh

Juliet Sutherland wrote: [...]
> > Oh yes, I figured I would need to contact the person who originally > did the book, and ask for a high res file of the images. I just wanted > someone to see what a difference lightening them, makes. > > Leigh _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d --------------------------------- Talk is cheap. Use Yahoo! Messenger to make PC-to-Phone calls. Great rates starting at 1¢/min. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/af27622a/attachment-0001.html From urbangleaner56 at yahoo.com Sun Jul 9 20:50:29 2006 From: urbangleaner56 at yahoo.com (Jacqulyn Perry) Date: Sun Jul 9 20:50:33 2006 Subject: [gutvol-d] Comments & Questions About Book Illustrations In-Reply-To: Message-ID: <20060710035029.19529.qmail@web38514.mail.mud.yahoo.com> LOL! Yeah, this looks like it's going to combine some of my favorite things... art, and learning new stuff! Can't get much better than that! Leigh Andrew Sly wrote: That's a good point. Through being involved with Project Gutenberg, I've learned many different things in more areas than I would have imagined. Andrew On Sun, 9 Jul 2006, Jacqulyn Perry wrote: > The other thing is, that I need to educate myself about computer imaging programs etc., because until I did a web search, I had no idea WHAT PNG is, or that computers are capable of 'true color'. I need to talk to my son-in-law and ask him how to go about finding out what my computer and monitor are capable of. So obviously I have some homework ahead of me. > Leigh > _______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d --------------------------------- Do you Yahoo!? Get on board. You're invited to try the new Yahoo! Mail Beta. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060709/d785884c/attachment.html From grythumn at gmail.com Sun Jul 9 21:23:13 2006 From: grythumn at gmail.com (Robert Cicconetti) Date: Sun Jul 9 21:23:15 2006 Subject: [gutvol-d] Comments & Questions About Book Illustration In-Reply-To: <20060710034702.36120.qmail@web38510.mail.mud.yahoo.com> References: <44B17CBE.6020907@verizon.net> <20060710034702.36120.qmail@web38510.mail.mud.yahoo.com> Message-ID: <15cfa2a50607092123g5982d349g3c8cb9233d76de1c@mail.gmail.com> There is some documentation at: http://www.gimp.org/docs/ And a list of books in print at: http://www.gimp.org/books/ I have done most of the image processing on the Beatrix Potter books (Not all, but most of them), and while I'm not 100% happy with them, they are fairly decent. For an example: http://www.gutenberg.org/files/17089/17089-h/17089-h.htm My workflow for the potter books (color, screened images): (Note, this also assumes that you are working within the DP workflow; not everyone chooses to do so.) 1) Scan at 600 DPI 24-bit color, using a scanner profile, and not auto-exposure. (Auto-exposure was causing some hard-to-track-down color shifts.) I have calibrated my scanner with a standard target, but the bigger issue is to get the scans consistent across the whole scan. A rough calibration of your monitor settings is not a bad idea, either. There are a number of websites with color bars to assist this. 2) Load two or three images into Gimp/Photoshop, and play with the settings for that particular book.. radius for gaussian blur in the descreening process, levels, unsharp settings, etc. After determining the settings, I'll load them into an action and apply them to all images. a) Descreen. There are some alternative methods (the russian descreening plugin is pretty good at preserving detail), and some scanner drivers do better than others when descreening at stage 1. 
Typically, if descreening in software, I'll magnify to 200%, and adjust the gaussian blur radius to just show a slight visible hatching in darker areas. b) Downscale (bicubic) to 300 DPI. This reduces file size and makes image easier to work with, now that we have removed any possibility of moire patterns. (Staying at 600 DPI does not help as we threw away any extra information with the blur.) c) Adjust levels, hue, and saturation. It is best done with a light hand, as it is difficult to tell exactly how the colors looked before they faded. Typically I'll show a sample to a few people as a sanity check at this stage. d) If I don't think the post processor will do one after the final scaling, I will do a moderate sharpening (usually unsharp mask) at this stage to compensate for the blurring effect from steps b and a. (These steps can't be scripted and have to be done to each image by hand.) e) Clean up dust marks, printing artifacts, etc. I haven't found anything particularly effective at repairing misaligned screens (it is usually not visible at the final resolution in any case), but most other problems can be cleaned up. f) Crop, and force remaining background (if any) to white. Remember, your image will be rectangular and displayed against a white background. You can specify an alpha layer in your PNG, but this will not be preserved in the final JPG. g) Save as PNG. At this stage, images will be on the order of 1 MB apiece. I'd prefer to save the raw scans as well, but they are much bigger. The post processor downloads these images, scales them to fit current guidelines (usually 400-600 pixels across), compresses them to an appropriate format (usually JPG for these types of images), and inserts them into the HTML. Most images end up in the 50-100 kb range. R C On 7/9/06, Jacqulyn Perry wrote: > > Oh good! Does the GIMP software have a Help file? I hope? > > Okay, what is DP? Is it part of PG? 
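Robert's steps a), b), and d) above are the mechanical, scriptable part of the pipeline. As a rough sketch of those three steps using Pillow (the function name, file paths, and all radius/percent values here are placeholder assumptions; as he notes, the real settings are tuned per book by eye):

```python
# Sketch of descreen -> downscale -> sharpen, per the workflow above.
# Blur radius and unsharp values are illustrative placeholders only.
from PIL import Image, ImageFilter

def process_scan(in_path, out_path, blur_radius=2.0):
    img = Image.open(in_path).convert("RGB")      # 600 DPI color scan
    # a) descreen: gaussian blur just strong enough to remove halftone dots
    img = img.filter(ImageFilter.GaussianBlur(blur_radius))
    # b) downscale (bicubic) to half resolution, i.e. 600 -> 300 DPI
    w, h = img.size
    img = img.resize((w // 2, h // 2), Image.BICUBIC)
    # d) moderate unsharp mask to compensate for the blurring in a) and b)
    img = img.filter(ImageFilter.UnsharpMask(radius=2, percent=80, threshold=3))
    img.save(out_path, "PNG")
```

Steps c), e), and f) (levels, dust cleanup, cropping) stay manual, which matches the note that they have to be done to each image by hand.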
I'm really interested in working with > childrens book illus, as that is what I have some background in as an > artist. After I've learned more about working with computer graphics > programs, I would probably be willing to help out with the other types of > imaging work as well. > Leigh > > *Juliet Sutherland * wrote: > > Hi Jacqulyn, > > You might have a look at The GIMP, which does almost everything > photoshop does and is free. > > There is an Illustrators team at DP that always needs help. I hope that > you, and perhaps some of the folks you are in contact with, will join > and give us a hand with our illustrations. I hope that eventually DP > will have a parallel process that will have experts preparing > illustrations while the text is being proofed and formatted. > > Whether for DP or otherwise, there are several common steps in prepping > illustrations for a PG book. First, the originals have to be scanned. > Getting good scans of illustrations takes practice and not all of our > volunteer content providers are good at it. But everyone does the best > they can and we encourage them all to scan illustrations at a decent > resolution and, in the case of DP, upload those scans to our server. > Another stage of the process is taking the raw scans and making them as > good as possible, while still leaving them large for archiving. I > usually do this before I upload to the DP server (stuff like deskewing, > making sure the colors match as best as possible, etc). But not all > volunteers have learned enough about graphics programs to do that part. > Then further, PG usually wants illustrations that will look good on a > screen, and to keep the overall file sizes down, so there is another > stage of processing that reduces the image as much as possible without > unacceptable loss of detail. There are definitely tricks to doing that > (which I don't know). 
Often folks will choose to make a smaller version > for display within the ebook and a larger one that can be obtained by > clicking on the picture. Also, what's considered "reasonable" for size > and detail depends to some extent on the book. A children's picture > book, or a book about art, can reasonably have larger illustrations than > something that was starting with not-so-good B&W photographs poorly > printed. > > We deal with everything from simple line art to steel-cut engravings > (very fine detail) to printed color illustrations (needing descreening) > to the xyz-gravure stuff that seems to scan beautifully (I don't know > what the process is for the various -gravure stuff but it doesn't seem > to result in the same kind of screen dots that one sees in most color or > B&W photo stuff), to beat-up decorative book covers. There are also > illustrations and maps that are too large to be scanned in one piece and > need to be put back together. Lots of challenges for people who like to > do restoration. > > JulietS > DP Site Admin > > > > Jacqulyn Perry wrote: > > > My real interest is in working with the images. My problem is I'm not > > very computer literate-I'm an 'old fashion' painter-so I'm not sure > > how much I can do with the limited graphics programs I have. > > > > I'm pretty sure I can take care of the smudge, but it would require me > > using Paint to remove the smudge, then printing the image out and > > doing the retouch by hand-I used to work as a photo re-touch > > person-then scanning the re-touched image to send back to you. Which I > > would be glad to do. > > > > I've posted the image at an artist website I belong to, that has a > > VERY active computer graphics forum and asked their advice. I've also > > asked about a good graphics program. Though from what I've seen so > > far, most of the images you folks have, at the most just require a > > little brightening and maybe a tiny bite of color adjustment. 
That I > > CAN do with what I have. Anything I CAN'T do, I will say so. > > > > I'm sure that Adobe Photoshop would take care of anything at all I > > would need to do, but due to lack of cash, buying it is out of the > > question for now. > > > > Oh yes, I figured I would need to contact the person who originally > > did the book, and ask for a high res file of the images. I just wanted > > someone to see what a difference lightening them, makes. > > > > Leigh > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > ------------------------------ > Talk is cheap. Use Yahoo! Messenger to make PC-to-Phone calls. Great rates > starting at 1¢/min. > > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060710/401b772e/attachment.html From sly at victoria.tc.ca Mon Jul 10 00:49:32 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 00:49:38 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: I've just sent a message to the PG catalog list exploring categorizing possibilities for PG. I've put a copy of it at: http://www.victoria.tc.ca/~sly/pgcat.txt Any extensive discussions might be better placed on the gutcat list. To subscribe, see: http://lists.pglaf.org/listinfo.cgi Andrew From hyphen at hyphenologist.co.uk Mon Jul 10 02:15:14 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Mon Jul 10 02:15:27 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: Message-ID: On Mon, 10 Jul 2006 00:49:32 -0700 (PDT), Andrew Sly wrote: | |I've just sent a message to the PG catalog list exploring |categorizing possibilities for PG. 
| |I've put a copy of it at: |http://www.victoria.tc.ca/~sly/pgcat.txt | |Any extensive discussions might be better placed |on the gutcat list. To subscribe, see: |http://lists.pglaf.org/listinfo.cgi Just a plea for Dewey Decimal. http://www.oclc.org/dewey/ -- Dave Fawthrop "Intelligent Design?" my knees say *not*. "Intelligent Design?" my back says *not*. More like "Incompetent design". Sig (C) Copyright Public Domain From Bowerbird at aol.com Mon Jul 10 02:43:12 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Jul 10 02:43:24 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <4f8.289c0c6.31e37b30@aol.com> andrew said: > Any extensive discussions might be better placed on the gutcat list. extensive discussions on this topic were already held here on gutvol-d. why go through it all again? and again and again and again? if you don't put this stuff on a wiki, you're just on a merry-go-round... it stifles participation when contributions regularly go down the drain. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060710/dd01bfb3/attachment-0001.html From marcello at perathoner.de Mon Jul 10 04:27:45 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Mon Jul 10 04:28:11 2006 Subject: [gutvol-d] PG Wiki In-Reply-To: <000001c6a3a2$31797270$0300a8c0@blackbox> References: <000001c6a3a2$31797270$0300a8c0@blackbox> Message-ID: <44B239B1.4000003@perathoner.de> Aaron Cannon wrote: > Hi Marcello and all. Any word on when the static content on Gutenberg.org > will be done away with and replaced by the wiki? I still have to port the FAQ and then I'd like to export all book reviews also. Currently there's a lot of traffic due to the 1M book drive; I think I'll switch when things return to normal. 
-- Marcello Perathoner webmaster@gutenberg.org From sly at victoria.tc.ca Mon Jul 10 08:32:20 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 08:32:46 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: Message-ID: On Mon, 10 Jul 2006, Dave Fawthrop wrote: > On Mon, 10 Jul 2006 00:49:32 -0700 (PDT), Andrew Sly > wrote: > > | > |I've just sent a message to the PG catalog list exploring > |categorizing possibilities for PG. > | > |I've put a copy of it at: > |http://www.victoria.tc.ca/~sly/pgcat.txt > | > Just a plea for Dewey Decimal. > http://www.oclc.org/dewey/ > As I say in my message: >Dewey-Decimal Classification > >This appeals to me for being strongly language independent. >That is, this could be perhaps the easiest way to classify >PG texts, which could, in the future, be translated into >different interfaces in different languages. > >Drawbacks: >Intellectual rights claims may limit usage. (OCLC claims rights >to use this system and licences it out to libraries.) A while ago, the Internet Public Library had a large index of online books, organized on a Dewey Decimal system. I wonder if pressure from OCLC had anything to do with it disappearing. Andrew From joshua at hutchinson.net Mon Jul 10 08:40:58 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Mon Jul 10 08:41:04 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> How is it that the OCLC can enforce such a claim when the DDS was first written in 1870 (according to their website)? Shouldn't it be out of copyright and therefore open for anyone to use? > ----- Original Message ----- > From: "Andrew Sly" > > Drawbacks: > > Intellectual rights claims may limit usage. (OCLC claims rights > to use this system and licences it out to libraries.) 
> From sly at victoria.tc.ca Mon Jul 10 08:51:13 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 08:51:15 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <4f8.289c0c6.31e37b30@aol.com> References: <4f8.289c0c6.31e37b30@aol.com> Message-ID: Thanks for trying to help here bb. What I'm doing here is not the same as unfocused, meandering exchange of possible ideas you can often find on a mailing list. (which is sometimes a good thing to have). Instead, I'm trying to move ahead with something that will actually contribute to making PG content more accessible to many people out there. I've already discussed this with Marcello, and we could already "have this stuff on a wiki", but I'm overly cautious, and wanted to get input from other people, that might help implement it in a better way, and also give people an idea of what's coming. Andrew On Mon, 10 Jul 2006 Bowerbird@aol.com wrote: > andrew said: > > Any extensive discussions might be better placed on the gutcat list. > > extensive discussions on this topic were already held here on gutvol-d. > why go through it all again? and again and again and again? > > if you don't put this stuff on a wiki, you're just on a merry-go-round... > > it stifles participation when contributions regularly go down the drain. > > -bowerbird > From hyphen at hyphenologist.co.uk Mon Jul 10 09:01:31 2006 From: hyphen at hyphenologist.co.uk (Dave Fawthrop) Date: Mon Jul 10 09:01:46 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: On Mon, 10 Jul 2006 10:40:58 -0500, "Joshua Hutchinson" wrote: |How is it that the OCLC can enforce such a claim when the DDS was first written in 1870 (according to their website)? Shouldn't it be out of copyright and therefore open for anyone to use? 
| | |> ----- Original Message ----- |> From: "Andrew Sly" |> > Drawbacks: |> > Intellectual rights claims may limit usage. (OCLC claims rights |> > to use this system and licences it out to libraries.) There will be a 1922 version which we could use. -- Dave Fawthrop "Intelligent Design?" my knees say *not*. "Intelligent Design?" my back says *not*. More like "Incompetent design". Sig (C) Copyright Public Domain From sly at victoria.tc.ca Mon Jul 10 09:13:52 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 09:13:55 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: On Mon, 10 Jul 2006, Dave Fawthrop wrote: > On Mon, 10 Jul 2006 10:40:58 -0500, "Joshua Hutchinson" > wrote: > > |How is it that the OCLC can enforce such a claim when the DDS was first written in 1870 (according to their website)? Shouldn't it be out of copyright and therefore open for anyone to use? > | > | > |> ----- Original Message ----- > |> From: "Andrew Sly" > |> > Drawbacks: > |> > Intellectual rights claims may limit usage. (OCLC claims rights > |> > to use this system and licences it out to libraries.) > > There will be a 1922 version which we could use. > Looking at the site: http://www.oclc.org/dewey/ I see this notice at the bottom of the page: All copyright rights in the Dewey Decimal Classification system are owned by OCLC. Dewey, Dewey Decimal Classification, DDC, OCLC and WebDewey are registered trademarks of OCLC. In other words, they are taking everything they can get. DDC is regularly revised. (I believe the 22nd edition is latest) Modern Dewey is significantly different from its original publication. Also, the term is trademarked. However, I'm not saying "No, this is impossible." A good thing about the wiki approach is that it (hopefully) encourages different concurrent approaches. I'm just suggesting that if PG ends up having a high-profile use of DDC, OCLC might object. 
A drawback of using some old version is inconsistency with what is current. Not only have many new headings been added over time, but there has been much revising and moving headings from one place to another. From cannona at fireantproductions.com Mon Jul 10 09:36:50 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Mon Jul 10 09:37:05 2006 Subject: [gutvol-d] Categorizing PG content References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: <001301c6a43f$0e2f8cd0$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Here is my $0.02: I think that the Dewey-decimal system is the best of all the options. The main advantage I see it having is that it would be easy to extract and work with when building collections. It would also be easy to use to find books from a particular subject. However, just because we have the Dewey system, I don't think that would mean that we couldn't have a wiki as well. A wiki is great because it would allow readers to categorize books beyond what was offered by Dewey. As far as the legal constraints go, you can see what Wikipedia is doing to overcome (or not) this limitation at http://en.wikipedia.org/wiki/Wikipedia:Dewey_Decimal_System Sincerely Aaron Cannon - -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers. iD8DBQFEsoIvI7J99hVZuJcRAgXlAJ9QaBmGsH715P1IxMx7Hy+jTq5b4wCg423d DvZedGOX4nbqPVsFUord/VI= =p9UN -----END PGP SIGNATURE----- From Bowerbird at aol.com Mon Jul 10 09:57:17 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Jul 10 09:57:23 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <515.37dcd19.31e3e0ed@aol.com> andrew said: > Thanks for trying to help here bb. thanks for recognizing the intent. 
> I'm trying to move ahead with something that > will actually contribute to making PG content > more accessible to many people out there. that's a great goal. putting your plan on a wiki -- the _plan_, not necessarily the catalog itself, although that might be a rather good idea too -- will ensure that other people will cumulate on it, if your current execution doesn't take it all the way to completion. otherwise, the institutional memory will be lost, and they'll be starting from scratch again, just like you are now. make sure your work _persists_. a wiki will even help in getting new people "up to speed" as they come on board your effort. it's rather unwieldy to access the archives of these listserves, as well as to keep up with "the current policy" when it necessarily evolves quickly... *** as for dewey decimal and the o.c.l.c., why don't you approach them and tell them you'd like to use their system for p.g., and maybe even ask them to contribute some of their _expertise_ in addition to simply granting permission. i would think that some people over there might be happy to do pro bono for p.g. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060710/363ee4d5/attachment.html From dixonm at pobox.com Mon Jul 10 10:00:15 2006 From: dixonm at pobox.com (Meredith Dixon) Date: Mon Jul 10 10:00:12 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: <44B2879F.7000502@pobox.com> There's also the problem that there's not just one "modern Dewey". The DDC as used outside the U.S. (the UDC) is significantly different from the DDC within the U.S. At least, that was true when I was last a cataloger; I've been out of the field for fifteen years. -- Meredith Dixon Check out *Raven Days* For victims and survivors of bullying at school. And for those who want to help. 
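For the broad-category approach under discussion (categorizing, rather than full library cataloging), even the ten top-level Dewey classes would go a long way, and those have been roughly stable since Dewey's original 1870s scheme. A hypothetical sketch of mapping a DDC call number to a broad category (illustrative only; not an existing PG tool, and the class labels follow older editions):

```python
# Ten top-level Dewey classes, keyed by the hundreds digit of a call number.
DDC_CLASSES = {
    0: "Generalities",
    1: "Philosophy & psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts & recreation",
    8: "Literature",
    9: "History & geography",
}

def broad_category(ddc_number):
    """Map a DDC call number such as '823.8' to its top-level class."""
    return DDC_CLASSES[int(float(ddc_number)) // 100]
```

For example, English fiction is classed in the 820s, so `broad_category("823.8")` falls under "Literature".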
From joey at joeysmith.com Mon Jul 10 12:09:43 2006 From: joey at joeysmith.com (joey) Date: Mon Jul 10 12:11:34 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: <20060710190942.GL20863@joeysmith.com> I don't think a wiki entry per book is a very elegant or scalable way to approach this. I have an alternate suggestion that I'd like to put together, but won't have anything to show until Friday. From Bowerbird at aol.com Mon Jul 10 13:28:01 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Jul 10 13:28:09 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <24d.d9fdbcc.31e41251@aol.com> joey said: > I don't think a wiki entry per book is > a very elegant or scalable way to approach this. again, i suggested a wiki for _coordination_, not for cataloging per se. there's a whole lot of coordination that needs to be done before you can even begin the cataloging. but having said that, i would agree with joey that it wouldn't be elegant or scalable to have a wiki-page for each book, and i don't think anyone would even suggest that for "a catalog". (you might want to have a separate wiki-page for each book for _discussion_ about that book, but that would be an entirely different animal.) i'd suggest a wiki-page for each "big category" -- e.g., reference, fiction, nonfiction, serials -- with a list of e-book numbers and titles in each. when a page gets too big, split it into sub-pages depending on what kind of split of it makes sense. (that's assuming that a split _does_ make sense.) but again, much of the thought-work on this has already been done previously on this very listserve, so someone should first recover all of that work instead of doing it all over again from scratch... further, it should be possible to leverage some of the work that greg just did in creating the d.v.d. 
for instance, i draw your attention to the files here: > http://snowy.arsc.alaska.edu/gbn/pgimages/amazon/ there are amazon pages for the penguin classics library; one list is sorted by title, the other list is sorted by author. although the book-links on these copied pages don't work, if you go to the current amazon pages, the links will work. those individual-book pages could be quite useful to you. for instance, the one for "around the world in 80 days" has: > Subjects > > Literature & Fiction > General > Classics > > Literature & Fiction > World Literature > British > 19th Century > > Literature & Fiction > World Literature > French > > Look for similar items by subject Classics > Fiction > French Novel And Short Story > Literature - Classics / Criticism > Literature: Classics > Voyages around the world > 19th century fiction > Classic fiction > Fiction / Classics > French if you were to scrape the amazon page for every e-text that p.g. has, you'd end up with a lot of information to help you create a catalog... perhaps the first thing you need to do is make a skeleton of exactly how you want your catalog to look, and how you want it to behave... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060710/8c016547/attachment.html From sly at victoria.tc.ca Mon Jul 10 14:06:32 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 14:06:36 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <24d.d9fdbcc.31e41251@aol.com> References: <24d.d9fdbcc.31e41251@aol.com> Message-ID: Ok, more clarification then. I'm not talking about trying to make something which strives towards the ideals of traditional library cataloging. Personally, I do not believe that that could happen with a group of volunteers. Please notice that the subject heading used for this discussion uses the word "Categorizing", not "cataloging". 
I am interested in, as you suggest below, creating broader categories. And yes, I am also aware of many sources we could look to for some ideas. Probably the most productive would be taking a look at places that have reformatted PG texts and presented them anew, such as Blackmask or Samizdat. Andrew On Mon, 10 Jul 2006 Bowerbird@aol.com wrote: > again, i suggested a wiki for _coordination_, > not for cataloging per se. there's a whole > lot of coordination that needs to be done > before you can even begin the cataloging. > but having said that, i would agree with joey > that it wouldn't be elegant or scalable to have > a wiki-page for each book, and i don't think > anyone would even suggest that for "a catalog". > > (you might want to have a separate wiki-page > for each book for _discussion_ about that book, > but that would be an entirely different animal.) > > i'd suggest a wiki-page for each "big category" > -- e.g., reference, fiction, nonfiction, serials -- > with a list of e-book numbers and titles in each. > > when a page gets too big, split it into sub-pages > depending on what kind of split of it makes sense. > (that's assuming that a split _does_ make sense.) > > but again, much of the thought-work on this has > already been done previously on this very listserve, > so someone should first recover all of that work > instead of doing it all over again from scratch... > > further, it should be possible to leverage some of > the work that greg just did in creating the d.v.d. 
> for instance, the one for "around the world in 80 days" has: > > > Subjects > > > Literature & Fiction > General > Classics > > > Literature & Fiction > World Literature > British > 19th Century > > > Literature & Fiction > World Literature > French > > > > Look for similar items by subject Classics > > Fiction > > French Novel And Short Story > > Literature - Classics / Criticism > > Literature: Classics > > Voyages around the world > > 19th century fiction > > Classic fiction > > Fiction / Classics > > French > > if you were to scrape the amazon page for every e-text that p.g. has, > you'd end up with a lot of information to help you create a catalog... > > perhaps the first thing you need to do is make a skeleton of exactly > how you want your catalog to look, and how you want it to behave... > > -bowerbird > From Bowerbird at aol.com Mon Jul 10 14:53:23 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Jul 10 14:53:49 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <542.2eb5a78.31e42653@aol.com> andrew said: > And yes, I am also aware of many sources we > could look to for some ideas. scraping will give you more than "ideas"... ;+) > Probably the most productive would be > taking a look at places that have reformatted > PG texts and presented them anew, such as > Blackmask, or Samizdat. fantastic idea! -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060710/3100cd05/attachment.html From greg at durendal.org Mon Jul 10 16:00:08 2006 From: greg at durendal.org (Greg Weeks) Date: Mon Jul 10 16:30:03 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <542.2eb5a78.31e42653@aol.com> References: <542.2eb5a78.31e42653@aol.com> Message-ID: On Mon, 10 Jul 2006, Bowerbird@aol.com wrote: >> Blackmask, or Samizdat. Speaking of Blackmask, has anyone heard anything about them? 
They are still down from the DMCA takedown and I've not heard anything since. -- Greg Weeks http://durendal.org:8080/greg/ From phil at thalasson.com Mon Jul 10 18:15:44 2006 From: phil at thalasson.com (Philip Baker) Date: Mon Jul 10 18:18:08 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: In article <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com>, Joshua Hutchinson writes >How is it that the OCLC can enforce such a claim when the DDS was first written >in 1870 (according to their website)? Shouldn't it be out of copyright and >therefore open for anyone to use? Simply using the DDS does not necessarily require making a copy of any DDS specification. To give an analogy: you don't breach copyright by following the diet presented in a diet book. -- Philip Baker From joshua at hutchinson.net Mon Jul 10 19:18:11 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Mon Jul 10 19:18:17 2006 Subject: [gutvol-d] Copyrighted texts in PG catalog Message-ID: <20060711021812.0C9F99EEB4@ws6-2.us4.outblaze.com> What does the catalog system check for to see if a text is copyrighted? For instance, http://www.gutenberg.org/etext/16697 is copyrighted, and shows up as such in each file, but the catalog page shows that it is NOT copyrighted. Is this something I may have done wrong (or David did wrong in posting) or is it a problem with the catalog system? Thanks, Josh From sly at victoria.tc.ca Mon Jul 10 19:56:03 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Mon Jul 10 19:56:07 2006 Subject: [gutvol-d] Copyrighted texts in PG catalog In-Reply-To: <20060711021812.0C9F99EEB4@ws6-2.us4.outblaze.com> References: <20060711021812.0C9F99EEB4@ws6-2.us4.outblaze.com> Message-ID: That is a good question. Well, I have one guess. 
When I look at the copyrighted PG texts posted in 2005, it looks like they all have these two lines in the header: ** This is a COPYRIGHTED Project Gutenberg eBook, Details Below ** ** Please follow the copyright guidelines in this file. ** In the text in question, I found this: This is a _copyrighted_ Project Gutenberg eBook, details below. Please follow the copyright guidelines in this file. I would guess that it is something to do with the particular formatting with the **'s I've marked PG#16697 as copyrighted and generated a new page for it. How many others might be like this...? Andrew On Mon, 10 Jul 2006, Joshua Hutchinson wrote: > What does the catalog system check for to see if a text is copyrighted? For instance, http://www.gutenberg.org/etext/16697 is copyrighted, and shows up as such in each file, but the catalog page shows that it is NOT copyrighted. Is this something I may have done wrong (or David did wrong in posting) or is it a problem with the catalog system? > > Thanks, > Josh From gbuchana at teksavvy.com Mon Jul 10 20:18:02 2006 From: gbuchana at teksavvy.com (Gardner Buchanan) Date: Mon Jul 10 20:26:30 2006 Subject: [gutvol-d] Comments & Questions About Book Illustrations In-Reply-To: References: <20060709093716.13839.qmail@web38509.mail.mud.yahoo.com> Message-ID: <44B3186A.8000302@teksavvy.com> Hi Andrew, Andrew Sly wrote: > How accomplished are you at editing images? If you'd like to help > with this particular item I have underway, one image could use a > small clean-up where there seems to be a smudge of blue ink. > Take a look at http://www.victoria.tc.ca/~sly/pb.htm > (See the first "alphabet" image, on the baby's forehead.) > This is beyond what I feel comfortable dealing with. > If you are interested, I have a 1.4mb png file that I would > request to have edited and returned in the same format. > If you have not already found someone with the necessary software, I can take care of that in a couple of secs.
It would be best to work from a larger sized image -- I assume the large PNG is basically at the original scanned resolution. Also, if it seems desirable, I can set you up with Photoshop and/or any other Adobe software you might need or want. I work for Adobe and can get anything PG might need. ============================================================ Gardner Buchanan Ottawa, ON FreeBSD: Where you want to go. Today. From marcello at perathoner.de Tue Jul 11 05:43:40 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Jul 11 05:44:08 2006 Subject: [gutvol-d] Copyrighted texts in PG catalog In-Reply-To: <20060711021812.0C9F99EEB4@ws6-2.us4.outblaze.com> References: <20060711021812.0C9F99EEB4@ws6-2.us4.outblaze.com> Message-ID: <44B39CFC.6030108@perathoner.de> Joshua Hutchinson wrote: > What does the catalog system check for to see if a text is > copyrighted? For instance, http://www.gutenberg.org/etext/16697, is > copyrighted, and shows up as such in each file, but the catalog page > shows that it is NOT copyrighted. Is this something I make have done > wrong (or David did wrong in posting) or is it a problem with the > catalog system? if (/COPYRIGHTED Project Gutenberg eBook/i) { $o->{'copyright'} = 1; } See also: http://www.gutenberg.org/howto/header-howto -- Marcello Perathoner webmaster@gutenberg.org From joshua at hutchinson.net Tue Jul 11 05:48:32 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Tue Jul 11 05:48:34 2006 Subject: [gutvol-d] Copyrighted texts in PG catalog Message-ID: <20060711124832.222149E836@ws6-2.us4.outblaze.com> Quick look through my recently posted shows these: 16697 - marked as copyright, but has an extra .htm file. Keep the .html and delete the .htm 16939 16940 16941 16983 (This one also has an extra .htm file. Keep the .html and delete the .htm) 16984 (This one also has an extra .htm file. Keep the .html and delete the .htm) 16985 (This one also has an extra .htm file. 
Keep the .html and delete the .htm) 16986 (This one also has an extra .htm file. Keep the .html and delete the .htm) 17309 17310 Josh NOTE: For those curious, I believe the extra .htm file came about because we reposted these recently to fix a problem with a missing PGHeader/Footer text and the old scripts generated .htm and the new scripts generate .html > ----- Original Message ----- > From: "Andrew Sly" > To: "Project Gutenberg Volunteer Discussion" > Subject: Re: [gutvol-d] Copyrighted texts in PG catalog > Date: Mon, 10 Jul 2006 19:56:03 -0700 (PDT) > > > That is a good question. > > Well, I have one guess. When I look at the copyrighted PG texts > posted in 2005, it looks like they all have these two lines in > the header: > > ** This is a COPYRIGHTED Project Gutenberg eBook, Details Below ** > ** Please follow the copyright guidelines in this file. ** > > In the text in question, I found this: > > This is a _copyrighted_ Project Gutenberg eBook, details below. Please > follow the copyright guidelines in this file. > > I would guess that it is something to do with the particular > formatting with the **'s > > I've marked PG#16697 as copyrighted and generated a new page > for it. How many others might be like this...? > > Andrew > > On Mon, 10 Jul 2006, Joshua Hutchinson wrote: > > > What does the catalog system check for to see if a text is copyrighted? For > > instance, http://www.gutenberg.org/etext/16697, is copyrighted, and shows up > > as such in each file, but the catalog page shows that it is NOT copyrighted. > > Is this something I make have done wrong (or David did wrong in posting) or > > is it a problem with the catalog system? 
> > > > Thanks, > > Josh > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From marcello at perathoner.de Tue Jul 11 06:02:29 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Tue Jul 11 06:02:57 2006 Subject: [gutvol-d] Copyrighted texts in PG catalog In-Reply-To: <20060711124832.222149E836@ws6-2.us4.outblaze.com> References: <20060711124832.222149E836@ws6-2.us4.outblaze.com> Message-ID: <44B3A165.9080609@perathoner.de> Joshua Hutchinson wrote: > 16939 > 16940 > 16941 > 16983 (This one also has an extra .htm file. Keep the .html and delete the .htm) > 16984 (This one also has an extra .htm file. Keep the .html and delete the .htm) > 16985 (This one also has an extra .htm file. Keep the .html and delete the .htm) > 16986 (This one also has an extra .htm file. Keep the .html and delete the .htm) > 17309 > 17310 gutenberg=> update books set copyrighted = 1 where pk in (16939,16940,16941,16983,16984,16985,16986,17309,17310); UPDATE 9 gutenberg=> Bibrec pages will get fixed when their caching expires (max. 24 h) or you can rebuild them manually with: http://www.gutenberg.org/etext/16697r (notice the 'r' appended to url) -- Marcello Perathoner webmaster@gutenberg.org From prosfilaes at gmail.com Tue Jul 11 13:07:57 2006 From: prosfilaes at gmail.com (David Starner) Date: Tue Jul 11 13:08:06 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> Message-ID: <6d99d1fd0607111307ic18afd6n8184ed4c5398fc4d@mail.gmail.com> On 7/10/06, Philip Baker wrote: > Simply using the DDS does not necessarily require making a copy of any > DDS specification. To give an analogy. You don't breach copyright by > following the diet presented a diet book. But following a diet is a personal action that doesn't fix anything in a permanent format. 
Using the DDS to categorize a library is to create something in permanent physical form that basically embodies the system in such a way that the system could more or less be extracted from our catalog. That's a whole different issue, and I think a judge might well rule in favor of them on it. It is, IMO, creating a legally actionable derivative work. From klofstrom at gmail.com Tue Jul 11 13:33:46 2006 From: klofstrom at gmail.com (Karen Lofstrom) Date: Tue Jul 11 14:05:35 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <6d99d1fd0607111307ic18afd6n8184ed4c5398fc4d@mail.gmail.com> References: <20060710154059.1A38E9EEE3@ws6-2.us4.outblaze.com> <6d99d1fd0607111307ic18afd6n8184ed4c5398fc4d@mail.gmail.com> Message-ID: <1e8e65080607111333n7eb4f54fna8d41ccdbcb41bfe@mail.gmail.com> Suggestion: have a competition to design an open-source cataloging system for e-books, where there are no physical constraints on "shelving." Publicize it in library schools. Major ego-boo for the teacher/graduate student whose scheme is accepted, free design for PG. -- Zora aka Karen Lofstrom From ricardofdiogo at gmail.com Tue Jul 11 20:07:35 2006 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Tue Jul 11 20:07:37 2006 Subject: [gutvol-d] Copyright question In-Reply-To: References: Message-ID: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> Hi. Is it possible to send PG a collaborative/"distributed" translation (made, for instance, at Wikisource), based on an already PG published eBook? (This wouldn't actually be a self-submitted translation, since Wikisource works based on GFDL... And the original PG etext is already copyright cleared...) Ricardo 2006/7/4, Andrew Sly : > > Copyright laws are different in every country. > > I know that in Canada, the duration of copyright is > determined by the life-span of the creator, regardless > of who actually owns the copyright. I cannot speak for > any other countries.
> > You are unlikely to find a useful answer here on the > Project Gutenberg Volunteer Discussion list. For a > list dedicated to discussing copyright issues, see: > http://www.cni.org/forums/cni-copyright/ > > Andrew > > On Tue, 4 Jul 2006, Juhana Sadeharju wrote: > > > > > Hello. Most often I hear that the copyright of a book lasts 80 years > > after the death of the author. But it is normal that the copyright is > > transferred to the publisher in the contract. Then why is the copyright > > expiration still tied to the author, who doesn't have the copyright > > anymore? Is this misuse of copyright law? Should the author keep the > > copyright (and the publisher only a license) so that the death+80 rule > > applies? > > > > That is most convenient to publishers, of course, because they > > get the copyright and its expiration is still tied to the author. > > > > In the example case, the book-writing contract was made 8 years > > ago and the contract included the second edition published now. > > Because the publisher owns the copyright of the second edition > > already due to the contract, the author has never owned the copyright. > > So how, in this case, can the copyright expiration be tied to > > the author at all? > > > > Juhana > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > -- "I saw of what night the light of day is made!" (Antero de Quental) Give electronic books to the World.
Help at http://www.pgdp.net and at http://dp.rastko.net From sly at victoria.tc.ca Tue Jul 11 21:34:47 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Tue Jul 11 21:34:50 2006 Subject: [gutvol-d] Copyright question In-Reply-To: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> Message-ID: Ricardo: Interesting idea. At one point I had thought about what it would take to set up some kind of distributed translation process. I would say the best way to proceed is to send an email to Greg Newby (gbnewby [at] pglaf.org) asking this question, with complete details of the item in question, and a link to it. And I'm curious, could you let me know what it is too? Andrew On Wed, 12 Jul 2006, Ricardo F Diogo wrote: > Hi. Is it possible to send PG a collaborative/"distributed" translation > (made, for instance, at Wikisource), based on an already PG published > eBook? (This wouldn't actually be a self-submitted translation, since > Wikisource works based on GFDL... And the original PG etext is already > copyright cleared...) > > Ricardo > From gbnewby at pglaf.org Wed Jul 12 00:08:34 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Wed Jul 12 00:08:36 2006 Subject: [gutvol-d] Copyright question In-Reply-To: References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> Message-ID: <20060712070834.GA32363@pglaf.org> On Tue, Jul 11, 2006 at 09:34:47PM -0700, Andrew Sly wrote: > > Ricardo: > > Interesting idea. At one point I had thought about what > it would take to set up some kind of distributed > translation process. > > I would say the best way to proceed is to send an > email to Greg Newby (gbnewby [at] pglaf.org) > asking this question, with complete details of > the item in question, and a link to it. Hi, Ricardo. As long as it's formatted OK for us, we would probably accept it.
We don't go for a lot of stuff in a few categories of items (tech docs and religion are two examples), and don't publish PDF-only documents nor, usually, HTML without plain text. You can see more guidance here on formatting: http://www.gutenberg.org/faq and here for our general non-public domain submission guidelines: http://www.gutenberg.org/howto/scopy-howto Note that we don't need the same level of permission letter for a GFDL/CC/etc. free license, but we still like to ask for permission. And, of course, it's still copyrighted. Finally, note that we don't have the personnel to handle frequent updates. For documents in flux, PG is likely not the right destination. I hope this helps, and thanks for your suggestions & ideas. -- Greg > > And I'm curious, could you let me know what > it is too? > > Andrew > > On Wed, 12 Jul 2006, Ricardo F Diogo wrote: > > > Hi. Is it possible to send PG a collaborative/"distribued" translation > > (made, for instance, at Wikisource), based on an already PG published > > eBook? (This wouldn't actually be a self-submitted translation, since > > Wikisource works based on GFDL... And the original PG etext is already > > copyright cleared...) > > > > Ricardo > > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d From walter.van.holst at xs4all.nl Wed Jul 12 05:12:40 2006 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Wed Jul 12 05:48:28 2006 Subject: [gutvol-d] Copyright question In-Reply-To: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> Message-ID: <44B4E738.8070501@xs4all.nl> Ricardo F Diogo wrote: > Hi. Is it possible to send PG a collaborative/"distribued" translation > (made, for instance, at Wikisource), based on an already PG published > eBook? 
(This wouldn't actually be a self-submitted translation, since > Wikisource works based on GFDL... And the original PG etext is already > copyright cleared...) That would require some license from the translators that would fit with PG. In several countries you have copyrights for translators. Regards, Walter From ricardofdiogo at gmail.com Wed Jul 12 05:56:42 2006 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Wed Jul 12 05:56:44 2006 Subject: [gutvol-d] Copyright question In-Reply-To: <20060712070834.GA32363@pglaf.org> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <20060712070834.GA32363@pglaf.org> Message-ID: <9c6138c50607120556t35cab646qbf40781c1c4c13fc@mail.gmail.com> Hi again 2006/7/12, Greg Newby : > On Tue, Jul 11, 2006 at 09:34:47PM -0700, Andrew Sly wrote: > > > > Ricardo: > > > > Interesting idea. At one point I had thought about what > > it would take to set up some kind of distributed > > translation process. > > Theoretically, something like PGDP would do (with some improvements and special documentation), since we could make "png's" from the txt files, with a few lines per page for guidance. Project comments/discussions could support the translation and the Post-Processor would need to check not only the format but also the consistency of the translation. It would be definitely harder than "simple" proofing, but.... it would be possible. This system could also be used not only for translations but also for other tasks PG could benefit from but PGDP doesn't support. For these other tasks it would be nice to have some sort of a "Distributed Gutenbergers". For instance, due to deep spelling changes, Portuguese ebooks we're publishing now will probably be read only by University teachers/students. (I suppose the same happens with German). Ordinary people simply give up after a paragraph or may think they have lots of mistakes. 
I'm modernizing the spelling of an average etext myself, and I calculate the differences at more than two thousand. On a distributed basis it would be a lot easier. This discussion on distributed translation started from a post in the Portuguese PGDP forum. I'm not thinking of any particular translation, Andrew, but perhaps I'll add _Alice's Adventures in Wonderland_ to the translations in progress section at http://pt.wikisource.org to see what happens. Here's how they do it there: http://pt.wikisource.org/wiki/Predefini%C3%A7%C3%A3o:Lista_dos_Textos_em_Tradu%C3%A7%C3%A3o > > Hi, Ricardo. As long as it's formatted OK for us, we would probably > accept it. We don't go for a lot of stuff in a few categories of items > (tech docs and religion are two examples), and don't publish PDF-only > documents nor, usually, HTML without plain text. > > You can see more guidance here on formatting: Thanks Greg. I'm familiar with PG rules (actually I've already translated some of them and I'm just waiting for PG's wiki to go public in order to add them). > Note that we don't need the same level of permission > letter for a GFDL/CC/etc. free license, but we still > like to ask for permission. And, of course, it's still > copyrighted. > How would this work? Would someone just need to send you an email saying "Greg, I guess the distributed translation made at... is pretty good" so you could ask (e.g. Wikisource) permission to add it to PG's catalog? Then would you answer your volunteer "Wikisource said it's OK for us to redistribute that translation. You are now allowed to format the Wikisource file according to our rules."? And PG's license would be something like "Public domain as long as you do not change this file. Produced by Wikisource"? > Finally, note that we don't have the personnel to > handle frequent updates. For documents in flux, PG > is likely not the right destination. > Yes.
I guess that anyone who actually wants to add a distributed translation to PG's collection must perform some kind of translation project management. The hard decision is, I think, to know when to stop, since translations can be changed _ad aeternum_. But even if the translation undergoes deep changes, PG could always add another version. And if only a few changes are made, another edition. We can use PG's wiki itself to make such translations. Hopefully, it will be easier than having to deal with outside projects. (Unless there's a specially-designed-for-PG project, like PGDP). > > > > > > And I'm curious, could you let me know what > > it is too? > > > > Andrew > > > > On Wed, 12 Jul 2006, Ricardo F Diogo wrote: > > > > > Hi. Is it possible to send PG a collaborative/"distributed" translation > > > (made, for instance, at Wikisource), based on an already PG published > > > eBook? (This wouldn't actually be a self-submitted translation, since > > > Wikisource works based on GFDL... And the original PG etext is already > > > copyright cleared...) > > > > > > Ricardo > > > > > _______________________________________________ > > gutvol-d mailing list > > gutvol-d@lists.pglaf.org > > http://lists.pglaf.org/listinfo.cgi/gutvol-d > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > -- "I saw of what night the light of day is made!" (Antero de Quental) Give electronic books to the World. Help at http://www.pgdp.net and at http://dp.rastko.net From hart at pglaf.org Wed Jul 12 09:38:47 2006 From: hart at pglaf.org (Michael Hart) Date: Wed Jul 12 09:38:49 2006 Subject: !@!Re: [gutvol-d] Copyright question In-Reply-To: <44B4E738.8070501@xs4all.nl> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <44B4E738.8070501@xs4all.nl> Message-ID: On Wed, 12 Jul 2006, Walter van Holst wrote: > Ricardo F Diogo wrote: >> Hi.
Is it possible to send PG a collaborative/"distributed" translation >> (made, for instance, at Wikisource), based on an already PG published >> eBook? (This wouldn't actually be a self-submitted translation, since >> Wikisource works based on GFDL... And the original PG etext is already >> copyright cleared...) > > That would require some license from the translators that would fit with PG. > In several countries you have copyrights for translators. I don't think anything would be required if the translation were made for PG of a work PG already had published, as stated above. It would be implicit in the offering of the translations to PG that it was meant for distribution through the normal PG channels. However, if the translator did want works in translation copyrighted, and distributed through PG that way, this would be the same as with any copyrighted work, and we would need a permission letter stating that this was a copyrighted work for PG distribution and we would use the normal copyright header/footer.
Michael > > Regards, > > Walter > > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From sly at victoria.tc.ca Wed Jul 12 13:00:47 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Wed Jul 12 13:00:50 2006 Subject: !@!Re: [gutvol-d] Copyright question In-Reply-To: References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <44B4E738.8070501@xs4all.nl> Message-ID: I would think that in some kind of collaborative translation project as discussed, it might be ideal to have a little notice, something along the lines of "By contributing, I agree that all my contributions are released to the public domain (or released under some CC licence, etc.)" For an example of a copyrighted translation of a work that came from PG, take a look at: Szachy i Warcaby: Droga do mistrzostwa http://www.gutenberg.org/etext/15201 A Polish translation of Edward Lasker's Chess and Checkers: The Way to Mastership Andrew On Wed, 12 Jul 2006, Michael Hart wrote: > I don't think anything would be required if the translation were made for PG > of a work PG already had published, as stated above. It would be implicit > in the offering of the translations to PG that it was meant for distribution > through the normal PG channels. However, if the translator did want works > in translation copyrighted, and distributed through PG that way, this would > be the same as with any copyrighted work, and we would need a permission > letter stating that this was a copyrighted work for PG distribution and we > would use the normal copyright header/footer.
> > Michael > > From walter.van.holst at xs4all.nl Wed Jul 12 13:15:17 2006 From: walter.van.holst at xs4all.nl (Walter van Holst) Date: Wed Jul 12 13:16:08 2006 Subject: !@!Re: [gutvol-d] Copyright question In-Reply-To: References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <44B4E738.8070501@xs4all.nl> Message-ID: <44B55855.2030800@xs4all.nl> Michael Hart wrote: > I don't think anything would be required if the translation were made > for PG > of a work PG already had published, as stated above. It would be implicit > in the offering of the translations to PG that it was meant for > distribution > through the normal PG channels. However, if the translator did want works Yes, I would agree that there is an implicit licence. However, as soon as the translation is printed, etc., you might run into issues. I may be a nitpicker, but I'd prefer some clear understanding, for example a CC licence. Regards, Walter From Bowerbird at aol.com Wed Jul 12 14:15:22 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Wed Jul 12 14:15:28 2006 Subject: [gutvol-d] writely and so on Message-ID: <55f.264ae44.31e6c06a@aol.com> given the emergence of rich-text editing on the web, exemplified by writely (which will be widespread soon, since google bought 'em), has distributed proofreaders explored this new possibility yet? -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060712/a2383a6f/attachment.html From gbnewby at pglaf.org Wed Jul 12 23:56:46 2006 From: gbnewby at pglaf.org (Greg Newby) Date: Wed Jul 12 23:56:48 2006 Subject: [gutvol-d] DVD: last check In-Reply-To: <53f.2b98e6e.31e00bbe@aol.com> References: <53f.2b98e6e.31e00bbe@aol.com> Message-ID: <20060713065646.GA19596@pglaf.org> Thanks for this - a few good catches. I've uploaded a replacement newdvd.txt -- needed to cut a few to fit on the DVD, and fixed a few minor things.
The main thing that should be fixed, but will await a technical fix to the program, is that *8.txt, *7.txt, *0.txt and *.txt are not mutually exclusive (that is, they are all included to match .txt, or their counterparts to match txt/zip). More: On Fri, Jul 07, 2006 at 03:10:54PM -0400, Bowerbird@aol.com wrote: > greg- > > here is a slight reworking of that info-page you made on the d.v.d. > > search for question-marks to find questions i had... > > -bowerbird > > =========================================================== > > "july 2006 special" -- current as of ebook #18739 18815 > =========================================================== > > baseline: everything but the hgp: > > 1-2199 (then skip 2200-2224) > 2225-3500 (then skip 3501-3524) > 3525-11774 (then skip 11775-11799) > 11800-20000 > > =========================================================== > > particular items of interest included that are not text and not html: > > 116 (zip/avi, select "all") -- apollo 11 moon landing movie > 156 (midi, select "all") -- beethoven's 5th symphony audio > 249 (zip/html) -- french cave-paintings pictures > 256 (zip/mpg) -- rotating-earth movie > 3002 (mp3) -- janis ian, society's child audio > 5212-5216 ("all") -- a-bomb videos (was "5212-5215"? -- is 5216 a > compilation?) I ditched #5216 > 9551 ("all") -- human-read sherlock holmes audio > 10177 ("all") -- ride of the valkyries, audio > 17246 ("all") , but it doesn't include all the mp3s -- wrong e-text #? Mistake...I'm not sure what I had in mind. > =========================================================== > > selected top-100 titles to specify as html: > > 11 -- alice in wonderland (does this mean you used #928?) Added 928. > 132 -- art of war (does this mean you used #17405?) Added 17405 > 5000 (da vinci notebooks, html?, complete set of #4998 and #4999?) yes. > 5001 -- einstein's relativity > 5200 -- "metamorphosis" (anything special about this html file?) 
No, but it's one of my favorite titles :) > 8710 -- dore bible illustrations > 8800 -- dante's divine comedy (only available as html download) Fixed...dropped 8800, added 8779-8799 in plain HTML w/ images. (Also a favorite, though Paradiso is a lot less entertaining than Inferno!) > 9551 -- human-read sherlock holmes > 10681 -- roget's thesaurus (heavily formatted with styles) > 13510 -- "knots, splices and rope work" > > =========================================================== > > a few extras to specify as html: > > 10600 -- kerr's "voyages and travels" (but no images in this file?) I'm not sure what I had in mind... > > illustrated beatrix potter: > 17089, 15575, 15284, 15234, 15137, 15077, 14877, 14872, 14868, 14848, > 14838, 14837, 14814, 14797, 14407, 14304, 14220, 12103 > > the first 20 punch: -- (all of the punch are listed separately below) > 18114, 17994, 17654, 17653, 17634, 17629, 17596, 17471, 17397, 17216, > 16877, 16727, 16717, 16707, 16684, 16673, 16640, 16628, 16619, 16592, Dropped 10 to save space > the sciam (232mb) -- (listed below) -- (so, were all these included on the > dvd?) Dropped about 17 to save space > =========================================================== > > eliminate some titles that are part of series. > these "complete" volumes were skipped, > and their individual volumes were retained.) > > to skip (a total of 245 duplicate "completes"): > ... > > hgp items to skip (the reverse of the first list above): > > 11775-11799 > 3501-3524 -- (the original "4501-3524" was a mistake?) 
Yes....it's 3501, not 4501 > 2200-2224 > -- Greg From ricardofdiogo at gmail.com Thu Jul 13 01:55:50 2006 From: ricardofdiogo at gmail.com (Ricardo F Diogo) Date: Thu Jul 13 01:55:53 2006 Subject: !@!Re: [gutvol-d] Copyright question In-Reply-To: <44B55855.2030800@xs4all.nl> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <44B4E738.8070501@xs4all.nl> <44B55855.2030800@xs4all.nl> Message-ID: <9c6138c50607130155k5479e855we0742a76ebd2540a@mail.gmail.com> > > On Wed, 12 Jul 2006, Michael Hart wrote: > > I don't think anything would be required if the translation were made for PG > of a work PG already had published, as stated above. It would be implicit > in the offering of the translations to PG that it was meant for distribution > through the normal PG channels. Thing is, when those translations are made on websites that use the GFDL, like Wikisource, I suppose that in order to distribute them we'd have to add the GFDL itself, and PG would have to do the same (right?). But PG has its own licence, and the GFDL says no clauses can be added. > However, if the translator did want works > in translation copyrighted, and distributed through PG that way, this would > be the same as with any copyrighted work, and we would need a permission letter Yes, but for a massive/distributed/collaborative translation, who would write that letter? Only those who want to keep copyright? Even if one in a hundred? And if s/he doesn't write it? 2006/7/12, Andrew Sly : > I would think that in some kind of collaborative translation > project as discussed, it might be ideal to have a little > notice, something along the lines of "By contributing, > I agree that all my contributions are released to the public > domain (or released under some CC licence, etc.)" > Under US law, if a website's general license is the GFDL but for a given project we make such a public-domain notice, would it be effective? In some countries we can't release books to the public domain.
What we could do would be something like "By contributing, you agree that all your contributions are released to the public domain. If that doesn't apply in your country, you agree that all your contributions can be freely distributed, changed... etc." 2006/7/12, Walter van Holst : > Yes, I would agree that there is an implicit licence. However, as soon as > the translation is printed, etc., you might run into issues. I may > be a nitpicker, but I'd prefer some clear understanding, for example a > CC licence. Maybe an understanding between PG and other major projects like Wikimedia could make this issue a lot easier. (A default procedure for PG<-->Wikisource distributed/collaborative translations could save a lot of trouble and increase the number of translations.) -- "I saw of what night the light of day is made!" (Antero de Quental) Give electronic books to the World. Help at http://www.pgdp.net and at http://dp.rastko.net From Bowerbird at aol.com Thu Jul 13 14:42:29 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 13 14:42:34 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <57b.fcf4c8.31e81845@aol.com> karen said: > Suggestion: have a competition to design > an open-source cataloging system for e-books, > where there are no physical constraints on "shelving." > Publicize it in library schools. Major ego-boo for the > teacher/graduate student whose scheme is accepted, > free design for PG. um, i don't know that i'm seeing much quality thinking coming out of the library schools, i am chagrined to say. besides, it's not so much the "design" that is so difficult, but rather the _implementation_, and the grunt work of assigning e-texts within the system. so it'd be far better to have the competition at the _programming_ level... and again, much of the design work has already been done, when this thread had an earlier incarnation on this listserve. if no one is willing to check the archives, what's the point?...
finally, i'm not sure that y'all understand the major need here. and i'm quite certain that library-school students will miss it. answer this question: why should we categorize the e-texts? i'm serious. formulate an answer. i'll wait... got one? ok, great... if your response runs along the lines of "so end-users can find the book they want, and download it", you're on the wrong path. that's the function catalogs used to serve, in the dead-tree world. after all, since a person had made a trip to a library to get a book, and would have to be making another trip to bring it back, it made a lot of sense for that person to find a book that they would enjoy. in that scenario, the catalog helped avoid the cost of a wrong choice. the physical nature of bound pages creates a situation of obligation. but in our new era of high-bandwidth and terabyte hard-drives, it's silly for a person to spend even mere seconds trying to decide _whether_or_not_ to download a book. it's _far_ more convenient to download vast portions of the library, since they can have their computer do it automatically while they are partying, or sleeping... even the dial-up people can request the d.v.d., for free, and have the entire p.g. library sitting on their hard-disk in a week or so... not only is it not wise to make people spend any time "choosing", it's at odds with the important concept of _unlimited_distribution_. and that's why the library-school people don't understand this. because unlike them, we _want_ people to take a whole bunch! it's not just that there's "no shortage of shelf-space" with e-books, it's that we have an endless source of production. so take 'em all! we are all still trapped, to a large degree, by our history of scarcity, so it's difficult for us to realize how deeply it pervades our thinking. (especially since we all live in the real world too, where scarcity still is a hard fact of life.) but this is one place where we can shed that... 
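[Editor's sketch: the automatic bulk download described above -- have the computer fetch e-text after e-text unattended, after weeding out whole classes the user never wants -- could look roughly like this. The catalog shape, the skip-categories, and the URL template are illustrative assumptions, not Project Gutenberg's actual index layout or file naming.]

```python
import urllib.request

# Hypothetical skip-list: classes of e-texts a user has chosen to weed
# out up front (genome data, audio/video, languages they cannot read).
SKIP_CATEGORIES = {"human-genome", "audio", "video", "non-english"}

def build_queue(catalog):
    """Return the e-text numbers left after weeding out unwanted classes.

    `catalog` is a list of (etext_number, category) pairs -- an assumed,
    simplified stand-in for whatever index file a real tool would read.
    """
    return [num for num, cat in catalog if cat not in SKIP_CATEGORIES]

def download_all(queue, url_template, dest_dir="."):
    """Fetch each e-text in turn: as soon as one download finishes, the
    next is requested, so the whole run proceeds unattended."""
    for num in queue:
        url = url_template.format(num=num)  # url_template is hypothetical
        with urllib.request.urlopen(url) as resp:
            data = resp.read()
        with open(f"{dest_dir}/{num}.txt", "wb") as f:
            f.write(data)
```

[A real run would need a `url_template` matching the server's actual file layout, which is not assumed here; the point is only that the filter-then-fetch loop is a few lines of code.]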
these implications of unlimited-production-and-distribution turn our thinking on its head. instead of helping users choose what to _pick_ in the library, we have to help 'em choose what to _discard_. in many ways, this is a much easier task. human genome project files? ya, you probably won't want 'em. e-texts in a language that you cannot read? you can skip those. text-to-speech files? videos? magazines? maybe yes, maybe no. they start with 20,000 "possibles", weeding 'em out to their taste, thanks to our handy-dandy program, which then auto-downloads the ones that are left, in the background, with zero input needed. at that point, the cost of selecting a book is double-clicking it and starting to read it. and if it doesn't appeal to you, just stop reading and go on to the next one. you don't have much need for a catalog. oh sure, it might still be kind of handy to be guided to e-texts, so some means of categorizing an e-text as being "similar to" others would be nice. but that's how we need to _approach_ this project, from the get-go, and not from our implicit notions about "a catalog", because those are outdated and irrelevant to the task now at hand. you're barking up the wrong tree if you don't rearrange that thinking. but anyway, as i said, a system of categorization would be handy, and i'll have some work to show in that regard in a separate post. i believe it's important to start out with the philosophical point... -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060713/180337d5/attachment.html From jon at noring.name Thu Jul 13 15:28:54 2006 From: jon at noring.name (Jon Noring) Date: Thu Jul 13 15:29:02 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <57b.fcf4c8.31e81845@aol.com> References: <57b.fcf4c8.31e81845@aol.com> Message-ID: <1681483192.20060713162854@noring.name> Bowerbird wrote: > karen said: >> 
Suggestion: have a competition to design >> an open-source cataloging system for e-books, >> where there are no physical constraints on "shelving." >> Publicize it in library schools. Major ego-boo for the >> teacher/graduate student whose scheme is accepted, >> free design for PG. > answer this question: why should we categorize the e-texts? Actually, I think what we'd like to do is to "categorize" the texts using one or more categorical systems, and then embed that information right into the book (which is a digital object). This is essentially adding metadata, or what the Yahoo folk call "microformats" (which is a terrible name), right into the object. This is done now in many kinds of digital objects, such as audio, video and some ebook formats. This way no external categorization needs to be applied -- it is all recorded internally, meaning each book can become autonomous of the others since it carries its own metadata. Particular "libraries" can build a lookup table of their choosing by simply sniffing through all the texts it holds. It doesn't really matter where the text files are placed or organized in a file structure. Multiple categorization systems can be supported in parallel provided the texts carry the requisite information. In XML, there's a number of ways this info could be embedded. In plain text documents, some sort of machine-recognizable "plain text" syntax has to be developed -- it'd be quite simple, actually. I think those who advocate plain text should develop a "plain text" metadata system (such as one based on Dublin Core) to insert somewhere in the file. Jon Noring From Bowerbird at aol.com Thu Jul 13 15:54:56 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Thu Jul 13 15:55:06 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <32d.1305f980.31e82940@aol.com> it's a good thing i don't respond to jon noring any more or i'd just get bogged down in shit like markup and metadata. 
programming is a whole lot more fun. :+) -bowerbird p.s. luv ya, jon, no, seriously! thanks for all the work you do! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060713/080202d3/attachment.html From jon at noring.name Thu Jul 13 16:07:55 2006 From: jon at noring.name (Jon Noring) Date: Thu Jul 13 16:08:06 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <32d.1305f980.31e82940@aol.com> References: <32d.1305f980.31e82940@aol.com> Message-ID: <292503298.20060713170755@noring.name> bowerbird wrote: > p.s. luv ya, jon, no, seriously! thanks for all the work you do! Luv ya, too, buddy. :^) And keep up the work on ZML. As I've noted before, we do need a standardized way to express plain text books. Jon From sly at victoria.tc.ca Thu Jul 13 23:27:33 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Thu Jul 13 23:27:40 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <1681483192.20060713162854@noring.name> References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> Message-ID: Jon: One point to take into account is that the upcoming wiki categorizing will be flexible, never "finished", changing as needed. Embedding this in the files would take a much larger amount of effort, and remove much of the possibility for collaborative effort. Andrew On Thu, 13 Jul 2006, Jon Noring wrote: > Actually, I think what we'd like to do is to "categorize" the texts > using one or more categorical systems, and then embed that information > right into the book (which is a digital object). > > This is essentially adding metadata, or what the Yahoo folk call > "microformats" (which is a terrible name), right into the object. > This is done now in many kinds of digital objects, such as audio, > video and some ebook formats. 
> > This way no external categorization needs to be applied -- it is all > recorded internally, meaning each book can become autonomous of the > others since it carries its own metadata. Particular "libraries" can > build a lookup table of their choosing by simply sniffing through all > the texts it holds. It doesn't really matter where the text files are > placed or organized in a file structure. Multiple categorization > systems can be supported in parallel provided the texts carry the > requisite information. > From sly at victoria.tc.ca Fri Jul 14 00:01:50 2006 From: sly at victoria.tc.ca (Andrew Sly) Date: Fri Jul 14 00:02:40 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <57b.fcf4c8.31e81845@aol.com> References: <57b.fcf4c8.31e81845@aol.com> Message-ID: Hi bb. I'm replying to some statements from a few different messages of yours here. >extensive discussions on this topic were already held here on gutvol-d. >why go through it all again? and again and again and again? You've hinted a few times that I'm just starting from scratch here, ignoring what has already been done. This is not true. I've read with interest the previous discussion you mentioned. However, ideas seem to, all too often, be of the variety that would only be practicable if we had a couple of highly-trained, professional librarians who decided to donate their full-time services to PG. (I can dream, can't I?) So, I've actually tried doing something productive. I've spent countless hours editing parts of the PG online catalog (focusing mostly on author headings, having given up on trying to make title statements that would be acceptable to the Library Sciences community.) >besides, it's not so much the "design" that is so difficult, >but rather the _implementation_, and the grunt work of >assigning e-texts within the system. so it'd be far better >to have the competition at the _programming_ level... In this you are very correct. "The grunt work" is a very real factor here. 
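[Editor's sketch: the embedded plain-text metadata idea quoted above -- a "plain text" system based on Dublin Core, inserted somewhere in the file -- might look something like this. The field names are standard Dublin Core element names, but the `<<METADATA` / `METADATA>>` delimiters and the overall syntax are invented for illustration, not any agreed PG format.]

```python
# A hypothetical Dublin Core-style metadata block that could sit near
# the top of a plain-text e-text.  The delimiters are invented here.
SAMPLE = """\
<<METADATA
DC.Title: Alice's Adventures in Wonderland
DC.Creator: Carroll, Lewis
DC.Language: en
DC.Subject: Fantasy fiction
METADATA>>"""

def parse_metadata(text):
    """Pull Dublin Core-style fields out of an embedded plain-text block."""
    fields = {}
    inside = False
    for line in text.splitlines():
        line = line.strip()
        if line == "<<METADATA":
            inside = True          # start of the metadata block
        elif line == "METADATA>>":
            break                  # end of the block; ignore the rest
        elif inside and ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields
```

[A library could then build its lookup table by running this over every file it holds, as Jon describes, regardless of how the files are arranged on disk.]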
What I am hoping is to eventually get these wiki pages working in a way that will _invite_ people to contribute, making it more a collaborative effort. If I just go ahead and do as much as I can myself, there will really be no advantage over what I could have done just in editing the PG online catalog. >answer this question: why should we categorize the e-texts? >if your response runs along the lines of "so end-users can find >the book they want, and download it", you're on the wrong path. You then argue that: >in our new era of high-bandwidth and terabyte hard-drives, >it's silly for a person to spend even mere seconds trying to decide >_whether_or_not_ to download a book. it's _far_ more convenient >to download vast portions of the library, since they can have their >computer do it automatically while they are partying, or sleeping... So, let's assume that someone is interested in the Science Fiction books that we've posted a decent number of lately. Should this person have to download a few hundred books and then do his own time-consuming search of these books now on his own system, trying to identify which ones might be science fiction? I think the need for something like categorizing is apparent, because I've seen a decent number of independent web sites which present a subset of PG books relating to a certain subject. Ones that spring to mind are collections of Australiana, Canadiana, Esperanto-related topics (not necessarily _in_ Esperanto), and books related to the Philippines. Also, not long ago, I had someone ask if there was some way he could look through just 18th century books. I would argue that having general categories was one reason that Blackmask was so popular. >these implications of unlimited-production-and-distribution turn >our thinking on its head. instead of helping users choose what to >_pick_ in the library, we have to help 'em choose what to _discard_. I must disagree here. People, by nature, prefer to have a smaller number of choices. 
(How many people will look at an extensive menu in a restaurant, be intimidated, and just pick something off the small "feature" list?--having worked in such a place, I can tell you: lots of people.) Would you rather have a selection of items in one particular category that may be of interest, or have a massive list where you have to go "Nope, don't want that one. Nope, don't want that one" two-hundred times? >from the get-go, and not from our implicit notions about "a catalog", >because those are outdated and irrelevant to the task now at hand. Careful now. The traditional library catalog is still an extremely useful resource (for those who know how to use it). I might be susceptible to the argument, however, that its limits get stretched uncomfortably trying to describe digital material. Andrew From Bowerbird at aol.com Fri Jul 14 00:15:15 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 00:15:32 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <558.12396c80.31e89e83@aol.com> andrew said: > You've hinted a few times that I'm just starting from > scratch here, ignoring what has already been done. not you in particular. all of us in general. :+) it seems to me that _you_ are far ahead of the crowd by virtue of the fact that you're actively working on it. and i look forward to your results when you show us. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060714/1c9381a0/attachment-0001.html From joey at joeysmith.com Fri Jul 14 00:34:02 2006 From: joey at joeysmith.com (joey) Date: Fri Jul 14 00:36:30 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <57b.fcf4c8.31e81845@aol.com> References: <57b.fcf4c8.31e81845@aol.com> Message-ID: <20060714073402.GM20863@joeysmith.com> Sorry for the length, everyone, but I wanted to try and cover in words what I was unable to cover in production software. 
On Thu, Jul 13, 2006 at 05:42:29PM -0400, Bowerbird@aol.com wrote: ... > finally, i'm not sure that y'all understand the major need here. > and i'm quite certain that library-school students will miss it. > > answer this question: why should we categorize the e-texts? > > if your response runs along the lines of "so end-users can find > the book they want, and download it", you're on the wrong path. > > that's the function catalogs used to serve, in the dead-tree world. ... > but in our new era of high-bandwidth and terrabyte hard-drives, > it's silly for a person to spend even mere seconds trying to decide > _whether_or_not_ to download a book. it's _far_ more convenient > to download vast portions of the library, since they can have their > computer do it automatically while they are partying, or sleeping... I disagree. I have a 100Mb/s municipal fiber connection and almost 2 terabytes of disk space available, and "download[ing] vast portions of the library" is not an option for me. I don't find it difficult to imagine that if I have a hard time accepting this answer, there are going to be others who do so as well, with far fewer resources at their command. > even the dial-up people can request the d.v.d., for free, and have > the entire p.g. library sitting on their hard-disk in a week or so... I also don't agree with the implied assertion here that having the full (or even "vast portions of the") library means that users don't want help identifying and locating content within that collection. Of course, this means that we'll want to help people who download the library get the catalog data that matches their portion of the library! > not only is it not wise to make people spend any time "choosing", > it's at odds with the important concept of _unlimited_distribution_. Having a catalog does not equate to making people use it. It's a tool for those who want to make use of it. 
That said, let's make sure that whatever tool(s) we come up with fit as many of the perceived needs as we possibly can! You clearly have different ideas of the use of a catalog than do I. As you've already enumerated some of the points of *my* use, perhaps you could elaborate on your ideas? (On the other hand, if you already did this, ignore this request. I generally avoid topics once you start weighing in on them, so I may have missed the applicable portions from the last time this topic came up.) --- So, on to my proposal. I had hoped to actually be able to provide a tool demonstrating it, but my day job interfered too much this week to allow me to realize that hope. So instead, let me see if I can lay out the concept. It's based on the tagging system known as the "Debian Package Browser" [1]. Some important parts of the idea that might be missed initially: * Every book gets tagged initially with a placeholder value * Wherever we can identify existing valuable tags, they are added to the initial load. Some examples of tags I'd want include: year published in PG; Author/Creator; Language; LoC Class; Copyright Status (sounding familiar to anyone?) * Tags need to be nestable. This is something the Debian system is not able to support, but I think it's very important. One example Bowerbird already pointed out is the Amazon.com categorization scheme. * The default behaviour of the tagging system should be marking which of the existing tags are best applied to this book, but it also needs to be flexible enough to add new tags (and hierarchies thereof). Setting the default behaviour this way is one way of preventing the "del.icio.us syndrome" found in many folksonomies, where there are as many different ways of tagging a piece of content as there are users of the system. 
* It should be easy, when viewing a particular ebook, to do any of the following actions: view tags already on this book; see a list of "suggested tags", based on a weighted list of tags attached to content that has other tags in common with the current content; view other content tagged in common; add / remove tags. * It needs to be easy to see all content with a particular tag or tagset. I'm envisioning something akin to the Flamenco [2] system here. I envision a lot of things coming out of this effort, including an easier way for people to suggest content for the "Best Of" DVDs so that Greg doesn't have to do so much of the leg-work himself. As people come across suggestions, they tag them, then Greg can just pull a list of ebooks with that tag. I've done some work on a prototype, but as I said, the real world invaded and sapped my time. Then again, I know there are many others on this list that are talented software developers, so perhaps one of you will beat me to it...or propose an even better system. [1]: http://debian.vitavonni.de/packagebrowser/ [2]: http://flamenco.berkeley.edu/ If you'd like to see Flamenco at work, but don't have the resources to set it up yourself, drop me a line off-list and I'll provide you with a URL to one I've set up. From kth at srv.net Fri Jul 14 09:17:08 2006 From: kth at srv.net (Kevin Handy) Date: Fri Jul 14 09:02:17 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <1681483192.20060713162854@noring.name> References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> Message-ID: <44B7C384.5080700@srv.net> Jon Noring wrote: > Actually, I think what we'd like to do is to "categorize" the texts > >using one or more categorical systems, and then embed that information >right into the book (which is a digital object). > > Instead of embedding it into the e-book, I think it would work better as a separate file. 
If you embed it into the ebooks, you will need to put it in all the versions (html, text, pdf, tei, etc.), and keep ALL of them up-to-date. Also, you would have to search the entire text of the book to find all the meta-data. As a separate file, it would also be easier to download just that when you want to be able to do "local" searches, without needing to download the full text of every e-book. Also, if you want to make it "user" editable, however you want to define "user", it would be better as a separate file, so that the original files don't constantly get flagged as modified. Also, make it easy to join the meta-files into a single file (cat *.meta > all.meta would be ideal) so that large numbers of books could be munged at once, or catalogues of specific groupings could easily be created (e.g. science-fiction/german). This would just require having a header in each file specifying which book it applies to. The format could be text, or XML, or even tei. If you use an XML-based version, a text version could be easily created. >This is essentially adding metadata, or what the Yahoo folk call >"microformats" (which is a terrible name), right into the object. >This is done now in many kinds of digital objects, such as audio, >video and some ebook formats. > > Instead of just category, you could store all sorts of information in the "meta" file. Author's name, copyright date(s), categories (science fiction, horticulture, cook-book), available formats (text, html, tei, pdf, etc.), language(s), links to web sites, link to author meta file, and any other information you would like to find in a card catalog. 
It doesn't really matter where the text files are >placed or organized in a file structure. Multiple categorization >systems can be supported in parallel provided the texts carry the >requisite information. > > I think that it could become a problem if the meta-data in the different formats were found to be different. Which one has the most correct information, the text version or the html one? >In XML, there's a number of ways this info could be embedded. In plain >text documents, some sort of machine recognizable "plain text" syntax >has to be developed -- it'd be quite simple, actually. I think those >who advocate plain text should develop a "plain text" metadata system >(such as one based on Dublin Core) to insert somewhere in the file. > > If you wanted to search for all Polish math books, how would you write the query program so that you would get all of them, without duplicates because of the different formats, and without wasting a lot of CPU cycles? Not all texts have a .txt version. From hart at pglaf.org Fri Jul 14 09:18:12 2006 From: hart at pglaf.org (Michael Hart) Date: Fri Jul 14 09:18:14 2006 Subject: !@!Re: [gutvol-d] Copyright question In-Reply-To: <9c6138c50607130155k5479e855we0742a76ebd2540a@mail.gmail.com> References: <9c6138c50607112007r50633afdtb960db2ffe887b3f@mail.gmail.com> <44B4E738.8070501@xs4all.nl> <44B55855.2030800@xs4all.nl> <9c6138c50607130155k5479e855we0742a76ebd2540a@mail.gmail.com> Message-ID: On Thu, 13 Jul 2006, Ricardo F Diogo wrote: >> >> On Wed, 12 Jul 2006, Michael Hart wrote: >> >> I don't think anything would be required if the translation were made for >> PG >> of a work PG already had published, as stated above. It would be >> implicit >> in the offering of the translations to PG that it was meant for >> distribution >> through the normal PG channels. 
> > Thing is, when those translations are made in websites that have a > GFDL, like Wikisource, I suppose that in order to distribute them we'd > have to add the GFDL itself, and PG would have to do the same > (right?). But PG has its own licence, and GFDL says no clauses can be > added. Yet one more reason to stay away from such licences, just more trouble. The PG licence works just fine for this, better than GPL, or others. Best to just make sure everyone working on such projects understands and approves the process before they start. Keep it simple. . . . Thanks!!! Give the world eBooks in 2006!!! Michael S. Hart Founder Project Gutenberg Blog at http://hart.pglaf.org > >> However, if the translator did want works >> in translation copyrighted, and distributed through PG that way, this >> would >> be the same as with any copyrighted work, and we would need a permission >> letter > > Yes, but for a massive/distributed/collaborative translation who would > write that letter? Only those who want to keep copyright? Even if one > in a hundred? And if s/he doesn't write it? > > 2006/7/12, Andrew Sly : >> I would think that in some kind of collaborative translation >> project as discussed, it might be ideal to have a little >> notice, something along the lines of "By contributing, >> I agree that all my contributions are released to the public >> domain (or released under some CC licence, etc.)" >> > Under US law, if a website's general license is GFDL but for a given > project we make such a public domain notice, would it be effective? > > In some countries we can't release books to the public domain. What we > could do would be something like "By contributing, > you agree that all your contributions are released to the public > domain. If that doesn't apply to your country, you agree that all your > contributions can be freely distributed, changed... etc." > > 2006/7/12, Walter van Holst : > >> Yes, I would agree that there is an implicit licence. 
However, as soon as >> the translation would be printed etc., you might run into issues. I may >> be a nitpicker, but I'd prefer some clear understanding, for example a >> CC licence. > > Maybe an understanding between PG and other major projects like > Wikimedia could make this issue a lot easier. (A default procedure for > PG<-->Wikisource distributed/collaborative translations could save a > lot of trouble and increase the number of translations.) > > -- > "I saw what night the light of day is made of!" > > (Antero de Quental) > > Give e-books to the World. Help at http://www.pgdp.net and at > http://dp.rastko.net > _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From Bowerbird at aol.com Fri Jul 14 10:19:28 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 10:19:39 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <483.5cfc7ed.31e92c20@aol.com> joey said: > I have a 100Mb/s municipal fiber connection > and almost 2 terabytes of disk space available, > and "download[ing] vast portions of the library" > is not an option for me. well joey, i do look forward to your tool, when you find time to create it, because these general discussions we are having around this topic have a lot of fuzziness about them, which must all be resolved when one starts writing code. so i won't respond to all your points until i can see exactly what you meant by them. but this point here is quite easy to deal with. downloading the project gutenberg library -- even the whole thing -- can be a breeze. first of all, as is always the default with me, i'm only concerned with one version of each -- the "master version", in z.m.l. format -- as the other versions can be spun out of it. second, as i said, it's reasonable to eliminate big classes of e-texts from the downloading, such as the human genome files, audio/video, and books in languages that you don't read... 
third, there are a lot of duplicate files where pieces of a volume were presented separately, and then the volume as a whole in another file. now that we have the information (thanks greg), those separate-piece files can easily be ignored. fourth, there are some people who will not want the magazines that are being added increasingly. once you've eliminated all of these files from your download queue, you find the list is much smaller. on to the next step... i have written a program that lets a person click one button to start downloading e-texts as a background process on their machine. as soon as one e-text has been completely received, the next one is requested, thus the downloading is _relentless_, and you'd be surprised how fast it goes. for a d.s.l. person like myself, after doing the deletions i mentioned above, it will merely take _a_few_days_ to download all the e-texts. to get the _whole_ library, it might take you a week or so. but remember, during this whole time, you will not have to do a single thing. all you had to do was click that one button. plus, you do have to enter a code every 108 minutes, but it's just this sequence of 6 numbers, no big deal. ;+) > I also don't agree with the implied assertion here > that having the full (or even "vast portions of the") > library means that users don't want help identifying > and locating content within that collection. it was only because i knew some might _infer_ such an "assertion" that i closed my post with the explicit note that this later purpose _is_ still "handy", and therefore should be the _focus_ of this task. did you read that? > I generally avoid topics once you start weighing in on them, > so I may have missed the applicable portions from the last time > this topic came up. well that's a remarkable admission. 
since i "weigh in" on every topic that is _interesting_ and usually "start" doing so fairly early in the thread, that must mean you're "avoiding" most of the posts, and all the interesting threads. life must be sad. :+) at any rate, i thank you for your candor. perhaps you will thank me for mine when i tell you that if you didn't read what i have written on this topic before, you're likely to take a path that will end up biting your ass. *** anyway, as i read your proposal, it's a social tagging scheme. as a general approach, that would be one way of doing things. again, the specifics are vital, so let us know when you have 'em. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060714/bae55553/attachment.html From ajhaines at shaw.ca Fri Jul 14 10:38:46 2006 From: ajhaines at shaw.ca (Al Haines (shaw)) Date: Fri Jul 14 10:41:42 2006 Subject: [gutvol-d] Page numbers in text e-books Message-ID: <000a01c6a76c$5bd99310$6401a8c0@ahainesp2400> I'm looking for examples of how page numbers are handled/formatted throughout the main portion of a text e-book (that material between its Table of Contents and its Index). Can someone point me to a few examples? I've tried looking myself, but finding a text e-book with page numbers (aside from those in its TOC and Index), in a collection of nearly 19,000 e-books, is kind of needle-in-haystackish . Thanks, Al From Bowerbird at aol.com Fri Jul 14 10:41:56 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 10:42:05 2006 Subject: [gutvol-d] Categorizing PG content Message-ID: <490.5f0e857.31e93164@aol.com> kevin said: > Instead of embedding it into the e-book, > I think it would work better as a seperate file. if i had to choose between the two, i'd agree with you. but there's no reason we can't do it both ways. > If you embed it into the ebooks,? 
you will need to > put it in all the versions (html, text, pdf, tei, etc.), > and keep ALL of them up-to-date. you put it in the master version (z.m.l.) and then re-propagate the auxiliary versions... > Also, if you want to make it "user" editable, > however you want to define "user", it would be > better as a separate file, so that the original files > don't constantly get flagged as modified. social tagging is an ongoing process, so yes, it doesn't make sense to put that into the file, because your files will be constantly changing. you could roll social tags into your documents on a regular basis, however, and that might be useful. (and every e-text should have a changelog anyway. until you install that, you'll never have a good handle on controlling the contents of your library. never.) but until we see a social tagging system that really works for our purposes, this planning is premature. > make it easy to join the meta-files into a single file > (cat *.meta > all.meta would be ideal) yes, of course. indeed the single-file version should be the one that is public-facing, for easy download. we can give 'em a tool that splits it on their machine. > The format could be text, or XML, or even tei. If you use an > XML based version, a text version could be easily created. at one time, i looked at the x.m.l. version of the catalog. what a bloated crufty mess! kevin, please demonstrate that there is some reality behind what you have said here by showing us "the text version that could be easily created". because in order to make any of these plans really _work_, we will need a simple list of the e-texts. i'd like to see one with about 20,000 lines, each line looking something like this: > 00011 -- alice's adventures in wonderland -- lewis carroll > Instead of just category, you could store all sorts of information > in the "meta" file. 
Author's name, copyright date(s), categories > (science fiction, horticulture, cook-book), available formats > (text, html, tei, pdf, etc.), language(s), links to web sites, > link to author meta file, and any other information > you would like to find in a card catalog, you'll find much of that data in the existing x.m.l. catalog. so have at it. show us what you can do with it. > Which one has the most correct information, > the text version or the html one? if such a difference comes into existence, you have a bigger problem, which is that your workflow has some bug in it that needs to be fixed. > If you wanted to search for all polish math books, > how would you write the query program so that > you would get all of them, without duplicates because > of the different formats, and without wasting a > lot of CPU cycles. Not all texts have a .txt version. good question. got an answer? -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060714/f8964029/attachment-0001.html From joshua at hutchinson.net Fri Jul 14 11:50:51 2006 From: joshua at hutchinson.net (Joshua Hutchinson) Date: Fri Jul 14 11:50:53 2006 Subject: [gutvol-d] Page numbers in text e-books Message-ID: <20060714185051.7EB83109ADE@ws6-4.us4.outblaze.com> Here is one I posted yesterday. http://www.gutenberg.org/etext/18827 The HTML and PDF (well, and the TEI master) versions have original page numbers, the plain text does not. Josh > ----- Original Message ----- > From: "Al Haines (shaw)" > To: "Project Gutenberg Volunteer Discussion" > Subject: [gutvol-d] Page numbers in text e-books > Date: Fri, 14 Jul 2006 10:38:46 -0700 > > > I'm looking for examples of how page numbers are handled/formatted throughout > the main portion of a text e-book (that material between its Table of Contents > and its Index). Can someone point me to a few examples?
> > I've tried looking myself, but finding a text e-book with page numbers (aside > from those in its TOC and Index), in a collection of nearly 19,000 e-books, is > kind of needle-in-haystackish. > > Thanks, > Al _______________________________________________ > gutvol-d mailing list > gutvol-d@lists.pglaf.org > http://lists.pglaf.org/listinfo.cgi/gutvol-d > From Bowerbird at aol.com Fri Jul 14 12:15:20 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 12:15:32 2006 Subject: [gutvol-d] Page numbers in text e-books Message-ID: al said: > I'm looking for examples of how page numbers are handled/ > formatted throughout the main portion of a text e-book > (that material between its Table of Contents and its Index). > Can someone point me to a few examples? i don't know of any _text_ examples in the p.g. library. but here's a demo of one using my zen markup language: > http://snowy.arsc.alaska.edu/bowerbird/myant/myant.zml this example was created for the purpose of coordinating the _scans_ for the pages with the text, so it's a little more broad than just the incorporation of the page-numbers... to see how these individual pages are presented to people for convenient viewing on the web, go to: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp003.html (the number in the u.r.l. indicates the page-number, so you can quickly and easily navigate to any page.) i can't say for sure that this is the _final_ version of how page-oriented markers will be formatted, but the final version won't be much different from this. for instance, here's the break between pages 9 and 10:

> she talked. "But first you come down to the
> kitchen with me, and have a nice warm bath
>
> [[9]]
> {{myantp010.png}} || My Antonia ||
>
> behind the stove. Bring your things; there's
> nobody about.

the page-number of a page is put underneath the text for the page, surrounded by double-brackets. (this is irrespective of where it was in the p-book.)
and right underneath that is the name of the scan for the next page (in this case, p. 10), surrounded by double-curly-brackets, and the running-head for that page is also included on that same line... (the or-bars indicate left/center/right justification.) in this particular example, there is one blank line above the double-bracketed page-number and one below the double-curly-bracket scan-name. this indicates that the paragraph is continued... in the case where a _new_ paragraph starts at the top of a page, there will be _two_ blank lines above the page-number on the previous page, as well as _two_ blank lines after the scan-name. this is because each page is an entity unto itself. thus, the preceding page needs to know that its bottom line is the concluding line of a paragraph -- because such lines are not to be justified -- and the first line on the following page needs to know that it's the first line of a paragraph, so that it's indented if the user specified such indentation. you can see a case where a new paragraph starts on a page by searching for "{{myantp004.png}}". this situation of new-versus-continued paragraphs is one that even abbyy hasn't quite perfected yet, so it's not at all uncommon to find errors in this regard. and sometimes the decision isn't all that easy to make. for example, look at this scan: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp014.png at the page bottom, the last line, is that the end of a paragraph? 
now look at this one and answer that same question: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp040.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp074.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp093.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp113.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp123.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp137.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp268.png and this one: > http://snowy.arsc.alaska.edu/bowerbird/myant/myantp407.png (to discover the answers, just add 1 to the number in each u.r.l., which will take you to the next page, where an indentation will indicate that a new paragraph has started, which means that the last line on the previous page was the end of that paragraph. for those who are too lazy to do this, the answers are in the p.s.) and there is another variant on this, the page on which a new _chapter_ starts. search for "{{myantp009.png}}" and you'll find an example of this. whenever this occurs, you'll see there are 3 blank lines above the page-number on the preceding page, and 4 blank lines below the scan-name (and thus above the title of the chapter, as per p.g. standard). pages that start a new _section_ have 3 blank lines above the page-number on the preceding page, and 7 blank lines below the scan-name (and above the title of the section). and yes, of course, the program that presents this z.m.l. file knows how to collapse all of that page-number/scan-name information appropriately, so the person reading the e-text doesn't have to deal with all of that disorienting clutter. 
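the collapsing step can be sketched in a few lines of python (an illustrative sketch only, not the actual z.m.l. software -- it handles just the continued-versus-new-paragraph case described above, not chapter or section breaks):

```python
import re

# illustrative sketch: strip a [[page-number]] / {{scan-name}} marker
# pair and use the blank-line counts around it to decide whether the
# surrounding paragraph continues across the page break.
MARKER = re.compile(r"(\n{2,})\[\[\d+\]\]\n\{\{[^}]+\}\}[^\n]*(\n{2,})")

def collapse_page_break(text):
    def join(m):
        # one blank line on each side (two newlines) = same paragraph;
        # two blank lines (three newlines) = a new paragraph starts.
        new_par = len(m.group(1)) >= 3 or len(m.group(2)) >= 3
        return "\n\n" if new_par else "\n"
    return MARKER.sub(join, text)

sample = ("kitchen with me, and have a nice warm bath\n\n"
          "[[9]]\n"
          "{{myantp010.png}} || My Antonia ||\n\n"
          "behind the stove. Bring your things; there's nobody about.\n")
# here the marker pair is removed and the two half-paragraphs rejoined
```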
the reader gets nicely formatted text -- indented paragraphs if they want, section and chapter headings that are big and bold, if they want, and page-numbers corresponding to the original p-book source, if they want -- but still the task of "authoring" the e-text (doing the zen markup, if you will) is very elementary. even a fourth-grader could do it. -bowerbird p.s. for the answers to the questions posed above, scroll down... respectively, the answers are no/no/no/no/no/yes/yes/yes/yes. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060714/f2fc6bc0/attachment.html From Bowerbird at aol.com Fri Jul 14 15:21:18 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 15:21:39 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme Message-ID: <386.82c81af.31e972de@aol.com> appended is a "first pass" at a p.g. categorization scheme. unlike the "tagging" models, i'm using here a "folder" model. inside each folder will be a bunch of alias "files" that point to the e-texts proper. this allows you to put one e-text into multiple folders, by simply generating another alias file. the possibility of nesting folders is also present. as i've said, i think the cataloging tool needs to give people an ability to rule out unwanted classes of e-texts, including:

first cut -- by language
second cut -- human genome
third cut -- audio
fourth cut -- video
fifth cut -- magazines
sixth cut -- copyrighted
seventh cut -- reference
eighth cut -- religious
ninth cut -- poetry
tenth cut -- plays
eleventh cut -- short story collections
twelfth cut -- anthologies

and once a person has ruled out whatever they don't want, the downloading of the other e-texts can go automatically. -bowerbird p.s.
books
    top-100 titles
    selected top-100 titles in html
    fiction
    nonfiction
    reference
        dictionary
        encyclopedia
        thesaurus
        quotations
    poetry
    plays
    short story collections
    anthologies
    religious
    heavily illustrated
        beatrix potter
    children's
        beatrix potter
    cookbooks
    how-to guides
    magazines
        punch/punchinello
        scientific american
items of interest that are not text and not html
    audio
        music
        human-read books
        text-to-speech books
    video
    copyrighted work
    human genome project
    languages other than english
        french
        german
        italian
        ...

-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060714/519f2296/attachment.html From brad at chenla.org Fri Jul 14 19:07:17 2006 From: brad at chenla.org (Brad Collins) Date: Fri Jul 14 19:04:37 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <1681483192.20060713162854@noring.name> (Jon Noring's message of "Thu, 13 Jul 2006 16:28:54 -0600") References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> Message-ID: Jon Noring writes: > In plain text documents, some sort of machine recognizable "plain > text" syntax has to be developed -- it'd be quite simple, > actually. I think those who advocate plain text should develop a > "plain text" metadata system (such as one based on Dublin Core) to > insert somewhere in the file. I would suggest using YAML -- there are a number of applications for processing it, and it can be mapped to dublin core elements easily. The following is a complete YAML Dublin Core document:

---
- title:
- creator:
- subject:
- description:
- publisher:
- contributor:
- date:
- type:
- format:
- identifier:
- source:
- language:
- relation:
- coverage:
- rights:

This can easily be parsed, it's human readable and maps well to html/xml structures.
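As a sketch of how such a block could be consumed (stdlib-only toy parser for this one simple "- key: value" shape; real code would use a YAML library such as PyYAML, and the sample values are made up):

```python
# toy parser for the simple "- key: value" lines shown above -- an
# illustrative sketch only; a real implementation would use a YAML
# library such as PyYAML rather than hand-rolled string handling.
def parse_dc(text):
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("- ") and ":" in line:
            key, _, value = line[2:].partition(":")
            meta[key.strip()] = value.strip() or None
    return meta

sample = """---
- title: My Antonia
- creator: Willa Cather
- language: en
"""
# parse_dc(sample)["creator"] -> 'Willa Cather'
```

An empty field (like the blank `- rights:` slot above) simply comes back as None, so the same fifteen-element skeleton can be stamped into every e-text and filled in over time.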
b/ -- Brad Collins , Banqwao, Thailand From klofstrom at gmail.com Fri Jul 14 19:24:39 2006 From: klofstrom at gmail.com (Karen Lofstrom) Date: Fri Jul 14 19:24:41 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> Message-ID: <1e8e65080607141924j38a25b8boe6a43c58f8fe6b06@mail.gmail.com> On 7/14/06, Brad Collins wrote: > I would suggest using YAML -- there are a number of applications for > processing it, and it can be mapped to dublin core elements easily. To my untutored eye, it looks good. I wonder if we could make a start at adding that info to all PG texts and ALSO develop an extra-textual cataloguing system that might contain more detail. My dream library would also contain info on how often a text had been downloaded and a rating/recommendation system like the various book and movie rating systems out there. You know, "Readers who liked 'Campfire Girls Go Bananas' also liked 'Campfire Girls Make Whoopee'." -- Karen Lofstrom From brad at chenla.org Fri Jul 14 21:29:48 2006 From: brad at chenla.org (Brad Collins) Date: Fri Jul 14 21:27:05 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme In-Reply-To: <386.82c81af.31e972de@aol.com> (Bowerbird@aol.com's message of "Fri, 14 Jul 2006 18:21:18 EDT") References: <386.82c81af.31e972de@aol.com> Message-ID: Bowerbird@aol.com writes: > as i've said, i think the cataloging tool needs to give people > an ability to rule out unwanted classes of e-texts, including:
>
> first cut -- by language
> second cut -- human genome
> third cut -- audio
> fourth cut -- video
> fifth cut -- magazines
> sixth cut -- copyrighted
> seventh cut -- reference
> eighth cut -- religious
> ninth cut -- poetry
> tenth cut -- plays
> eleventh cut -- short story collections
> twelfth cut -- anthologies

Functionally this is no different from using Borges' fictional Chinese encyclopedia for dividing different kinds of animals.
first cut -- belonging to the Emperor
second cut -- embalmed
third cut -- tame
fourth cut -- sucking pigs
fifth cut -- sirens
sixth cut -- fabulous
seventh cut -- stray dogs
eighth cut -- included in the present classification
ninth cut -- frenzied
tenth cut -- innumerable
eleventh cut -- drawn with a very fine camelhair brush
twelfth cut -- et cetera
thirteenth cut -- having just broken the water pitcher
fifteenth cut -- that from a long way off look like flies

Your folders are just as semantically flat as tags. You're also mixing different classes of metadata.

language : English, French, Finnish etc.
format : audio, video, html, plain_text
form : prose, poetry, drama, anthology, serial, reference
nature : fiction, biography, religious, textbook
licence : pg_licence, cc_licence, restricted, gpl etc.

Though personally, I would love to be able to rule out all ebooks "having just broken the water pitcher". b/ -- Brad Collins , Banqwao, Thailand From Bowerbird at aol.com Fri Jul 14 22:59:43 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Fri Jul 14 22:59:50 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme Message-ID: <2e3.9e537dd.31e9de4f@aol.com> brad said: > Functionally this is no different from using > Borges' fictional Chinese encyclopedia for > dividing different kinds of animals.
>
> first cut -- belonging to the Emperor
> second cut -- embalmed
> third cut -- tame
> fourth cut -- sucking pigs
> fifth cut -- sirens
> sixth cut -- fabulous
> seventh cut -- stray dogs
> eighth cut -- included in the present classification
> ninth cut -- frenzied
> tenth cut -- innumerable
> eleventh cut -- drawn with a very fine camelhair brush
> twelfth cut -- et cetera
> thirteenth cut -- having just broken the water pitcher
> fifteenth cut -- that from a long way off look like flies

ok, brad, you create that categorization scheme, and i'll continue with the work on creating mine (because i'm not looking for help from anyone), and we'll see which one appeals to users more... :+) > Your folders are just as semantically flat as tags. and your "semantically rich" system is a pipedream. :+) > You're also mixing different classes of metadata. which just goes to show how pointless "metadata" is. :+) dublin core. yeah, right... -bowerbird p.s. and if you want the first pass at collaborative filtering, just scrape amazon screens for their recommendations... -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060715/32f2ce97/attachment-0001.html From cannona at fireantproductions.com Sat Jul 15 06:47:26 2006 From: cannona at fireantproductions.com (Aaron Cannon) Date: Sat Jul 15 06:48:57 2006 Subject: [gutvol-d] final DVD up on the torrent tracker, with fixes Message-ID: <000401c6a815$5bad4720$0300a8c0@blackbox> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello all. As you know, we announced a DVD a few days ago and said that it was ready. Well, turns out that it was missing several hundred files. This has been fixed, and you can now download the true final version from http://snowy.arsc.alaska.edu:6969 . Thanks Greg for getting this done, and sorry for any inconvenience that this might have caused anyone. Sincerely Aaron Cannon -- Skype: cannona MSN/Windows Messenger: cannona@hotmail.com (don't send email to the hotmail address.) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (MingW32) - GPGrelay v0.959 Comment: Key available from all major key servers.
iD8DBQFEuPJLI7J99hVZuJcRAqinAKCBTSpqojhT9Vq+mRM2cOGiXqvUAACg+v0X y/7To5652Hj09weFlCpwxlA= =8F32 -----END PGP SIGNATURE----- From brad at chenla.org Sat Jul 15 07:20:39 2006 From: brad at chenla.org (Brad Collins) Date: Sat Jul 15 07:17:53 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme In-Reply-To: <2e3.9e537dd.31e9de4f@aol.com> (Bowerbird@aol.com's message of "Sat, 15 Jul 2006 01:59:43 EDT") References: <2e3.9e537dd.31e9de4f@aol.com> Message-ID: Bowerbird@aol.com writes: > which just goes to show how pointless "metadata" is. :+) > > dublin core. yeah, right... Okay, perhaps I was a tad harsh. But at the same time you are missing the two points I was trying to make.

a) It's not trivial to create a taxonomy because there are so many different ways that people organize things.

b) Metadata is simply breaking down information that describes something into well defined key/value pairs which have some commonality.

When the Internet came along a lot of people (including Yahoo) thought, ah, this ain't so tough, we don't need no stink'n librarians. By and large, those systems suck. Librarians think in long time frames, so often they are a bit behind what is happening on the edge. But that doesn't mean that the centuries of knowledge and experience they have accumulated is worthless. For stuff that has been created in the last five minutes or even fifteen months, tags are a fantastic means of categorizing content. But for anything that has survived longer than that and should be preserved, a solid cataloging regime should be used, supervised by folks who know what they are doing. Even for material that has already been formally cataloged, adding tags will still be a useful means of providing immediate context which a formal catalog can't provide. But I'm sorry. A Zen ML approach to cataloging? That dog don't hunt.
b/ -- Brad Collins , Banqwao, Thailand From Bowerbird at aol.com Sat Jul 15 10:58:17 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sat Jul 15 10:58:27 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme Message-ID: <553.3224165.31ea86b9@aol.com> brad said: > For stuff that has been created in the last five minutes or even > fifteen months, tags are a fantastic means of categorizing content. perhaps we should define our terms. do you think what i'm using is "tags"? > But for anything that has survived longer than that and > should be preserved, a solid cataloging regime should be used, > supervised by folks who know what they are doing. that's an interesting postulate. at least until we ask about where this "solid cataloging regime" is, and who among us is the "folks who know what they are doing" who should be "supervising"... i dunno, i guess the rest of us, presumably. you seem to be laboring under the impression that there are a fleet of highly-trained employees waiting for your leadership, and will jump to the task as soon as they receive instructions... i'm laying out a system that _i_ can create, all by _myself_, if necessary, which can be deployed easily, using software that i will write myself, all by myself, which i am _certain_ will have some usefulness to some end-users out there, without imposing any requirements on p.g. as a whole, and thus is totally "non-exclusive", which means that you are free to do the same thing, and our two methodologies can compete on the level playing-field of real-life users... so, have at it, my friend, have at it... :+) > But I'm sorry. A Zen ML approach to cataloging? > That dog don't hunt. then i shouldn't be able to come home with any birds. right? so let us see who can actually feed the end-users and who is left standing at the chalkboard in front of an empty classroom while they go hungry, shall we? ;+) the proof will be in the usage.
-bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060715/265e7f03/attachment.html From joey at joeysmith.com Sun Jul 16 00:46:35 2006 From: joey at joeysmith.com (joey) Date: Sun Jul 16 00:49:26 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme In-Reply-To: <553.3224165.31ea86b9@aol.com> References: <553.3224165.31ea86b9@aol.com> Message-ID: <20060716074635.GN20863@joeysmith.com> On Sat, Jul 15, 2006 at 01:58:17PM -0400, Bowerbird@aol.com wrote: > perhaps we should define our terms. > do you think what i'm using is "tags"? I see no distinction between your model and mine, other than what they're called. From joey at joeysmith.com Sun Jul 16 05:38:35 2006 From: joey at joeysmith.com (joey) Date: Sun Jul 16 05:41:28 2006 Subject: [gutvol-d] Categorizing PG content In-Reply-To: <1e8e65080607141924j38a25b8boe6a43c58f8fe6b06@mail.gmail.com> References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> <1e8e65080607141924j38a25b8boe6a43c58f8fe6b06@mail.gmail.com> Message-ID: <20060716123835.GO20863@joeysmith.com> On Fri, Jul 14, 2006 at 04:24:39PM -1000, Karen Lofstrom wrote: > On 7/14/06, Brad Collins wrote: > > To my untutored eye, it looks good. I wonder if we could make a start > at adding that info to all PG texts and ALSO develop an extra-textual > cataloguing system that might contain more detail. I don't think this is a bad idea at all. > My dream library would also contain info on how often a text had been > downloaded and a rating/recommendation system like the various book > and movie rating systems out there. You know, "Readers who liked > "Campfire Girls Go Bananas" also liked "Campfire Girls Make Whoopee". This is one of the reasons I wanted to go with a tagging/folksonomy model. I've already begun reading some of the available research on how to create collaborative filtering engines using folksonomies as seeds.
From Bowerbird at aol.com Sun Jul 16 10:35:35 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Sun Jul 16 10:35:46 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme Message-ID: <2d6.2e0d6680.31ebd2e7@aol.com> joey said: > I see no distinction between your model and mine, > other than what they're called. i can't really say that there _is_ a difference between them, joey, not until your model is fleshed out with a real live app. in my version, a book is represented by alias files that live in various folders. the names of those folders _could_be_ considered as "tags". but that's not how "tagging systems" are generally architected, not if i understand 'em correctly. what i'm looking to create is a simple system that people can understand implicitly, and operate easily on their machines... how that system is labeled is nothing but a semantic matter, just as long as everyone understands exactly how it works. i'll give people a "starter-set" of folders, but after that, they can develop things from there according to their own aims. if they want a category called "phat books", they tell my app to create a "phat books" folder, then they start checking off the books that they want to have represented in that folder. i see this as more disciplined and restrained than tagging, where idiosyncratic tags are more-or-less routinely applied. but, you know, i guess there's nothing to stop a person from generating many folders with just one or two books in each. at any rate, it is this _personalization_ of the categorization that i see as the main difference between folders and tags. tagging systems usually operate in a social network arena. and this is perhaps an important distinction as well, in that i see my system running as an app on a person's machine. although i think people haven't been too clear on it thus far, it seems that most of you see this operating on a webserver. 
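the folder-of-aliases idea above can be sketched in a few lines (an illustrative sketch only -- the function name and paths are hypothetical, and mac-style alias files are approximated here with symlinks):

```python
import os

def file_in_category(library, catalog_root, etext_name, category):
    """put an alias for library/etext_name into catalog_root/category.
    illustrative sketch only: each category is just a directory, and
    an "alias file" is modeled as a symlink back to the one real file."""
    folder = os.path.join(catalog_root, category)
    os.makedirs(folder, exist_ok=True)
    alias = os.path.join(folder, etext_name)
    if not os.path.lexists(alias):
        os.symlink(os.path.abspath(os.path.join(library, etext_name)), alias)

# the same e-text can appear in any number of folders at once:
# file_in_category("library", "catalog", "11.txt", "fiction")
# file_in_category("library", "catalog", "11.txt", "children's")
```

adding an e-text to another category costs one more link, not another copy of the file, which is the whole point of the model.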
as an aside, y'all might want to look at ning.com for a means by which you can easily create a social-networking web-app. now, it may well be that the best starter-set of folders is created via a tagging system, perhaps one that is generated on a wiki. as i said yesterday, though, i'm more interested in doing this _by_myself_, because it seems too difficult to get any helpers, and too unwieldy to build a system that captures all their help. (it's far easier to just write the program for the end-user and leverage the work that's already done in regard to cataloging. for instance, one of the first cuts will be on the _language_, and i can determine that by simply checking that in the file.) but, speaking of a wiki, i think what you would get from that would be more amenable to my "folder" structure than tags, because each "page" on the wiki would represent a "folder". at least that's how _i_ would organize the wiki. for instance, i'd have a "beatrix potter" wikipage that listed all her e-texts. and i'd have an "esperanto" wikipage listing all those e-texts. (of course, you could also organize the wiki with each page representing an e-text, and then apply the tags on the page. but i think you would find that approach to be cumbersome. again, until i can see an actual working example on your end, it's difficult for me to comment positively or negatively on it.) but since i'm doing my thing by myself, the architecture of my catalog depends on being able to collect almost all of the data _programmatically_, via computerized analysis of the e-texts. the other source of information i will use in generating the starter-set of folders is the catalog-structure richard seltzer has set up over at samizdat.com. (and, if i could recover it, i'd add the one that david moynihan had at blackmask.com.) to sum up, i don't want to spend a lot of time generating the initial catalog structure, and i don't want to spend a lot of time assigning e-texts within that catalog structure. ok?
my third concern is that people can modify to their desire. there are other concerns, too, such as being able to capture any additional information that people might contribute in the long run while modifying their catalog (the arena where tags really shine), but my 3 main concerns are the ones listed. also, a main goal of the starter-set is to give end-users a way to quickly eliminate the parts of the library they do not want to have downloaded to their machine, and give the rest of it a basic structure that can be navigated easily and intuitively, and i think the "folder" model qualifies well in that regard... so yes, you might be right that the same program that can administer the folder-structure might be an equivalent of one that can administer a tagging model. or it might not. i know what my app will look like. i need to see the other. -bowerbird -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060716/e1ef9d10/attachment.html From marcello at perathoner.de Sun Jul 16 16:59:56 2006 From: marcello at perathoner.de (Marcello Perathoner) Date: Sun Jul 16 17:00:36 2006 Subject: [gutvol-d] Categorizing in Wiki In-Reply-To: <20060716123835.GO20863@joeysmith.com> References: <57b.fcf4c8.31e81845@aol.com> <1681483192.20060713162854@noring.name> <1e8e65080607141924j38a25b8boe6a43c58f8fe6b06@mail.gmail.com> <20060716123835.GO20863@joeysmith.com> Message-ID: <44BAD2FC.7020303@perathoner.de> Before everybody goes all warm and fuzzy about his/her pet categorization scheme, let me remind you that the discussion started about how to use the wiki for categorizing. A wiki has no built-in authority control. If we want to end up with useful categories we need to develop a restricted vocabulary. The good news is: if we use one page per category, the vocabulary will build itself. Pages can easily be split or merged whenever the vocabulary changes. 
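As a sketch of the one-page-per-category idea (illustrative only: the list markup here is a generic MediaWiki-style guess rather than the site's exact conventions, and the helper name is made up):

```python
# illustrative sketch: render one category's book list as the body of
# a wiki "Bookshelf" page. the markup is generic MediaWiki-style; the
# real gutenberg.org wiki conventions may differ.
def bookshelf_page(category, books):
    lines = ["== %s ==" % category]
    for number, title in sorted(books):
        lines.append("* [http://www.gutenberg.org/etext/%d %s]" % (number, title))
    lines.append("[[Category:Bookshelf]]")
    return "\n".join(lines)

# bookshelf_page("Children's Fiction",
#                [(11, "Alice's Adventures in Wonderland")])
```

Because each category is one page, splitting or merging a category is just splitting or merging the corresponding list.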
Also, it is very easy to harvest sites that already have categorized pg books and convert their data into a wiki list. The easiest way to start is to:

1. Create an account
2. Create a page containing a list of books
3. Add the page to the "Bookshelf" category like here: http://www.gutenberg.org/wiki/Detective_Fiction

Remember that this is a wiki. Don't expect things you edit to stay edited. If you want to express a personal opinion use a subpage of your user page like this: http://www.gutenberg.org/wiki/User:Marcello/Marcello's_Tops_and_Flops -- Marcello Perathoner webmaster@gutenberg.org From joey at joeysmith.com Sun Jul 16 17:25:45 2006 From: joey at joeysmith.com (joey) Date: Sun Jul 16 17:28:40 2006 Subject: First Pass at Tagging Site [was Re: [gutvol-d] Categorizing PG content] In-Reply-To: References: Message-ID: <20060717002545.GC22029@joeysmith.com> I've got a first prototype up. Keep in mind I've only spent about half an hour on this so far, but I wanted to get something out so that it's not pure vapour. I don't have a lot of bandwidth on this host, so please don't do anything like trying to index/crawl the site. The code and database are available for anyone who'd like to see it. http://www.joeysmith.com:8080/ Some things I should point out:

1) The database model isn't where I want it yet. It only supports the parent/child level of tag nesting.
2) I only seeded it with 1000 books to start.
3) I seeded the tag list with just some VERY basic tags initially.
   special/untagged - All books have this tag
   special/none - This is a placeholder. It will probably not be in future releases.
   special/all - Also a placeholder.
   language/ - This is a list of all the languages we have books for.
4) This is still in the beginning stages, so the add and remove tag links do not yet work.

From joey at joeysmith.com Sun Jul 16 22:14:47 2006 From: joey at joeysmith.com (joey) Date: Sun Jul 16 22:17:44 2006 Subject: [gutvol-d] first pass at a p.g.
categorization scheme In-Reply-To: <2d6.2e0d6680.31ebd2e7@aol.com> References: <2d6.2e0d6680.31ebd2e7@aol.com> Message-ID: <20060717051447.GD22029@joeysmith.com> On Sun, Jul 16, 2006 at 01:35:35PM -0400, Bowerbird@aol.com wrote: > joey said: > > I see no distinction between your model and mine, > > other than what they're called. > > i can't really say that there _is_ a difference between them, > joey, not until your model is fleshed out with a real live app. Really, the only difference I can see between your model and mine is that mine is a server-side, collaborative effort, while yours builds desktop-oriented data islands. And I don't see why they would need to be mutually exclusive, either. Perhaps you could populate your client-side app with a list of "folders" and "alias files" generated by polling my server-side app, and perhaps people could publish their data island back to the world as a set of tags. From Bowerbird at aol.com Mon Jul 17 10:50:21 2006 From: Bowerbird at aol.com (Bowerbird@aol.com) Date: Mon Jul 17 10:50:32 2006 Subject: [gutvol-d] first pass at a p.g. categorization scheme Message-ID: <2fc.28b29100.31ed27dd@aol.com> joey said: > Really, the only difference I can see between your model and mine > is that mine is a server-side, collaborative effort, while yours > builds desktop-oriented data islands. well, you've done some semantic loading there, contrasting "collaborative" with "data-islands". i could counter with "nonpersonalized" versus "individualized", but i don't see much purpose. ultimately, the catalog should work both online and offline. and either approach is capable of it. the real difference is shown when you contrast your system with the wiki system marcello made, which operates in a fashion similar to my system. in your system, tags are applied toward e-texts. in marcello's system (or mine), the e-texts are applied toward categories. that's the difference.
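the two directions can be made concrete with a toy example (illustrative data, not real catalog entries -- the point is only that each view is an inversion of the other):

```python
# toy illustration: tags applied toward e-texts (one mapping) versus
# e-texts applied toward categories (the inverted mapping). either
# direction can be mechanically derived from the other.
tags_by_etext = {
    11: {"fiction", "children's"},
    345: {"fiction", "horror"},
}

etexts_by_category = {}
for etext, tags in tags_by_etext.items():
    for tag in tags:
        etexts_by_category.setdefault(tag, set()).add(etext)

# sorted(etexts_by_category["fiction"]) -> [11, 345]
```

so the data can round-trip; the difference is which direction the catalogers work in, and which one the software makes cheap.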
again, whether we want to consider that to be a "significant" difference doesn't really matter, as long as we understand that _is_ the difference. all this boils down to the catalogers: would they rather start with a _category_, and apply e-texts, or would they rather start with an e-text and then apply categories. i tend to think it'd be the former. (and certainly, from my standpoint of programmatic cataloging, rather than human-enacted, the former approach is far more easy for me to write code for.) there are still some things you haven't shown us with your example. the first is how nimble it is, once you have scaled it up to the 20,000 e-texts. so i multiplied your 1000-set by 20 times. it's here: > http://snowy.arsc.alaska.edu/bowerbird/misc/joeytags.html at 3.1 megs, it'll be a bit of a pain for dial-up users. the second is the back-end work that you will do to assign tags automatically. (for instance, although i recognize you were just using it for your example, language tags are unnecessary, because that info is already available in the current catalog, and just needs to be converted.) the third is how you're gonna let your catalogers add new tags to your standard set. the fourth -- a big one -- is how the catalog will be made into a public-facing entity end-users navigate. but none of these are especially _difficult_ challenges, so i'm sure you can manage them, and i look forward to experimenting with your system when it's finished. > And I don't see why they would need to be > mutually exclusive, either. sure, the data can be munged to work either way; as usual, it is just a question of _doing_ that work. and giving cataloger volunteers a choice of models would seem to me to be the best way to proceed... but my mission is not to work with volunteers at all, but rather to build a system myself and then put it directly into the hands of end-users. that's my focus. 
my gut feeling is that a wiki-style approach will be more
successful at attracting volunteers than your tag model,
but i wouldn't be surprised if i'm wrong.

marcello's wiki has an inordinate level of difficulty; i'd allow
people to simply enter the e-text number, and then automatically
generate all the other info. having them enter that info (and then
formatting all the links in wiki-markup) is a recipe for errors,
plus it raises the cost of contributing to the point where i think
you would have very few volunteers...

> Perhaps you could populate your client-side app
> with a list of "folders" and "alias files" generated by
> polling my server-side app, and perhaps people could
> publish their data island back to the world as a set of tags.

you are welcome to collect data from my model if you want.
and just as i'll be getting info from samizdat and blackmask,
i'll also collect data from your server-side approach if i can...
except i expect to be finished with my effort (limited as it is)
before you even get fully started with your (unlimited) one...

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060717/0581a771/attachment.html

From Bowerbird at aol.com Wed Jul 19 00:44:30 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Wed Jul 19 00:44:40 2006
Subject: [gutvol-d] 20,000 e-texts versus 300 billion dollars (a rather remarkable ant)
Message-ID: <48a.62de9c2.31ef3cde@aol.com>

juxtaposition with p.g. hitting 20,000 e-texts...
the cost of the war will soon hit $300 _billion_.
(very soon. it just crossed the $297 billion mark,
and a billion dollars just ain't what it used to be.)

the power to create, the power to destroy.
the human is a rather remarkable ant, eh?

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060719/66ab3e3b/attachment.html

From joey at joeysmith.com Thu Jul 20 18:19:32 2006
From: joey at joeysmith.com (joey)
Date: Thu Jul 20 18:23:09 2006
Subject: [gutvol-d] first pass at a p.g. categorization scheme
In-Reply-To: <2fc.28b29100.31ed27dd@aol.com>
References: <2fc.28b29100.31ed27dd@aol.com>
Message-ID: <20060721011932.GD7576@joeysmith.com>

Once again, I've found that my interest in PG cannot overcome my
dislike for certain people who've chosen to involve themselves.
I tried to convince myself I could get past it this time, and even
got so far as to write some code, but my anger and apathy have won
out again. If there's anyone who's interested in the Turbogears
project I put up the other day, let me know, but I doubt I'll put
any more effort into it at this point.

From Bowerbird at aol.com Sat Jul 22 14:19:02 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Jul 22 14:19:13 2006
Subject: [gutvol-d] 1:5 scale
Message-ID: <2d3.bb42e4c.31f3f046@aol.com>

i'm doing a 1:5 scale reworking of the p.g. library.
i have selected e-texts 10000-14000 to work on.

one early step is a massive clean-up of the catalog.
this might involve serious breakage of compatibility
with the existing library. (in other words, you won't
necessarily be able to easily import my corrections.)

if there is anyone out there who would like me to
maintain a limited compatibility, so as to dovetail
with their work, please do inform me immediately,
and we can have a discussion about cooperation...

thank you.

-bowerbird

p.s. if anyone would like to design the user interface
for a program for end-users to access this reworking,
feel free to share that too, frontchannel or backchannel.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060722/67c2a051/attachment.html

From Bowerbird at aol.com Sat Jul 22 16:27:12 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sat Jul 22 16:27:22 2006
Subject: [gutvol-d] re: 1:5 scale
Message-ID:

speaking of interfaces, wow, a whole shitload of librarians
just got their asses kicked bad...
> http://www.amitgupta.info/E41ST/

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060722/c6f184bc/attachment.html

From hyphen at hyphenologist.co.uk Sat Jul 22 21:43:30 2006
From: hyphen at hyphenologist.co.uk (Dave Fawthrop)
Date: Sat Jul 22 21:43:43 2006
Subject: [gutvol-d] 1:5 scale
In-Reply-To: <2d3.bb42e4c.31f3f046@aol.com>
References: <2d3.bb42e4c.31f3f046@aol.com>
Message-ID:

On Sat, 22 Jul 2006 17:19:02 EDT, Bowerbird@aol.com wrote:

|i'm doing a 1:5 scale reworking of the p.g. library.

What on earth is one fifth of a library?

--
Dave Fawthrop
"Intelligent Design?" my knees say *not*.
"Intelligent Design?" my back says *not*.
More like "Incompetent design".
Sig (C) Copyright Public Domain

From Bowerbird at aol.com Sun Jul 23 14:08:24 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Sun Jul 23 14:08:34 2006
Subject: [gutvol-d] 1:5 scale
Message-ID: <4a7.589392f.31f53f48@aol.com>

dave said:
> What on earth is one fifth of a library?

first and foremost, it means porta-potties instead of rest-rooms.

other than that, it means we have all the books, but just up through
page 18 with each one, because statistics tell us that that's only
how far most people read in a book that they buy...

> Most readers do not get past page 18 in a book they have purchased.
> http://www.parapublishing.com/sites/para/resources/statistics.cfm

many of you might know dan poynter, who collected all these statistics
from various places, as the guru of self-publishing...
michael, once you get past the frustration of poynter's statistics,
i'm positive you will find his web-page to be utterly fascinating...

here are a few samples, on publishers:
> 18. On average, they pay $465.17 for a simple cover design
> to as much as $3,533.26 for a complex cover design.
> Typical cover costs range $450 to $3,000.
...
> 24. An average of 10 to 15 hours are spent designing a book cover.
> 25. On average, 61 hours are spent in the editing process.
> 26. On average, 29 hours are spent producing a news release for a new book.

really, michael, i look forward to a ton of new posts from you
based on the motivation you'll get from all of these statistics... :+)

-bowerbird

p.s. dave, actually it just means that i started out with 4,000
out of the (roughly) 20,000 e-texts in the p.g. library, so 1/5.
for some reason (i'm not sure why), model-builders have always
been fond of the 1/5 scale. it's big enough to be "realistic",
yet still small enough to be a "model". it makes things very
cute, too.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060723/339b1484/attachment.html

From kth at srv.net Mon Jul 24 09:24:23 2006
From: kth at srv.net (Kevin Handy)
Date: Mon Jul 24 09:09:30 2006
Subject: [gutvol-d] 1:5 scale
In-Reply-To:
References: <2d3.bb42e4c.31f3f046@aol.com>
Message-ID: <44C4F437.8060404@srv.net>

Dave Fawthrop wrote:

>On Sat, 22 Jul 2006 17:19:02 EDT, Bowerbird@aol.com wrote:
>
>|i'm doing a 1:5 scale reworking of the p.g. library.
>
>What on earth is one fifth of a library?
>
>
Using a smaller font to save disk space?
;)

From marcello at perathoner.de Mon Jul 24 09:31:20 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Mon Jul 24 09:31:24 2006
Subject: [gutvol-d] 1:5 scale
In-Reply-To: <44C4F437.8060404@srv.net>
References: <2d3.bb42e4c.31f3f046@aol.com> <44C4F437.8060404@srv.net>
Message-ID: <44C4F5D8.7080409@perathoner.de>

Kevin Handy wrote:

>> What on earth is one fifth of a library?
>
> Using a smaller font to save disk space? ;)

Some people just don't play with a full deck.

--
Marcello Perathoner
webmaster@gutenberg.org

From Bowerbird at aol.com Mon Jul 24 11:44:42 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Mon Jul 24 11:44:50 2006
Subject: [gutvol-d] re: 1:5 scale
Message-ID: <56b.264f307.31f66f1a@aol.com>

well, obviously, in prototyping a system, you use just part
of the data, not all of it. especially on something like this. :+)

i subset the content so i can _examine_ it, molding it into shape
manually if necessary. in the process of that, i gain an
understanding of what needs to be done, so i can program it,
and i develop the first pass at those routines...

for instance, as i said, one of the first tasks is
whipping the catalog into the shape i want it...

that job has already taken me a number of hours, and it's not
done yet. the catalog was quite a mess -- and hey, it's just
titles and author-names! -- so all told, it'll probably take me
some 20 hours, and maybe 30, just for this 1:5 subset...

you can see the current state of my clean-up work here:
> http://snowy.arsc.alaska.edu/bowerbird/mcl/-catalog
there's still work that needs to be done on subtitles,
and on the "mirror" titles (which were a total disaster),
but other than that, this data is now very consistent...

by the time i'm done with this, i'll have good routines to
clean it up automatically, to the extent it's possible.
so i expect the next 1/5 of the catalog to be cleaned in half
the time -- 10-15 hours. during each phase, i'll pick up
more information on how to automate it.
so the 1/5 after that will take half the time again, about 6-8 hours.
and the next 1/5 will take half that, about 3-4 hours.
and the last 1/5 will take 1-2 hours.

and by then, i'll have some very well-polished routines.
so if i decided to do the whole job over, from scratch,
for maximal consistency, it'd take about 4-10 hours...

this time-savings, via automation, is what you want.
that's why you just do a subset of the data in a model.

-bowerbird
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060724/22648fad/attachment.html

From jon at noring.name Wed Jul 26 10:17:53 2006
From: jon at noring.name (Jon Noring)
Date: Wed Jul 26 10:24:19 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
Message-ID: <459964460.20060726111753@noring.name>

Everyone,

I just posted a TeleRead blog article, which in turn links to the blog
article posted by Catherine Hodge at DPP Store, about the World eBook
Fair (WeBF).

My blog article:
http://www.teleread.org/blog/?p=5230

Catherine's blog article:
http://dppebookstore.blogspot.com/2006/07/world-ebook-fair-12-million-downloads.html

Both Catherine and I are perplexed by the lack of public discussion
about the WeBF on the various ebook-related forums such as this one.
What are your thoughts?

Jon Noring

From sly at victoria.tc.ca Wed Jul 26 13:23:30 2006
From: sly at victoria.tc.ca (Andrew Sly)
Date: Wed Jul 26 13:23:34 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <459964460.20060726111753@noring.name>
References: <459964460.20060726111753@noring.name>
Message-ID:

On Wed, 26 Jul 2006, Jon Noring wrote:

> Both Catherine and I are perplexed by the lack of public discussion
> about the WeBF on the various ebook-related forums such as this one.
> What are your thoughts?

Ok, since you ask, I'll share my viewpoint.
I think that most PG volunteers are aware that PG texts are widely
reused, reformatted, and re-presented on many websites (and sometimes
in print, as well). The WEBF can be seen as just one more of these
instances. And yes, I know that it also contains many texts from other
collections. For years there have been thousands of other texts online
that cannot be found in PG. Again, the WEBF is just one more of these
instances.

What I do give it credit for is good marketing. It's like putting up
a big sign saying: "Free for a limited time only!" and giving away
material which you can find freely any time you want. However, for
some people, that might be the best way to get their interest.

It certainly does fit into Michael Hart's vision of giving away as
many eBooks as possible to as many people as possible in as many ways
as possible.

Andrew

From joey at joeysmith.com Wed Jul 26 23:00:11 2006
From: joey at joeysmith.com (joey)
Date: Wed Jul 26 23:04:50 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <459964460.20060726111753@noring.name>
References: <459964460.20060726111753@noring.name>
Message-ID: <20060727060011.GE7576@joeysmith.com>

For my part, I volunteered to help Greg with readingroo.ms
administration and then promptly forgot that you can't all see the
graphs I can about the amount of throughput that server generated. :)

From gbnewby at pglaf.org Thu Jul 27 02:41:16 2006
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Jul 27 02:41:18 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <20060727060011.GE7576@joeysmith.com>
References: <459964460.20060726111753@noring.name> <20060727060011.GE7576@joeysmith.com>
Message-ID: <20060727094116.GB2352@pglaf.org>

On Thu, Jul 27, 2006 at 12:00:11AM -0600, joey wrote:
> For my part, I volunteered to help Greg with readingroo.ms administration
> and then promptly forgot that you can't all see the graphs I can about the
> amount of throughput that server generated. :)

They're here (though maybe someday they should be password-protected):
http://ibis.riseup.net/munin/ms/readingroo.ms.html

We maxed at 100Mbps last week, and have been pushing 40-60Mbps daily,
with peaks during the daytime in Europe/Asia.
  -- Greg

From marcello at perathoner.de Thu Jul 27 06:15:49 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jul 27 06:15:53 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <459964460.20060726111753@noring.name>
References: <459964460.20060726111753@noring.name>
Message-ID: <44C8BC85.5060700@perathoner.de>

According to worldebookfair.com they serve 1 million ebooks / day.

gutenberg.org serves 60.000 ebooks / day.

According to alexa*) worldebookfair.com gets less traffic than
gutenberg.org and still they manage to serve 16 times as many ebooks.
I wonder how they do that?

*) http://www.gutenberg.org/internal/stats/alexa
   user: internal
   pass: books

On the plus side gutenberg.org gets some traffic from
worldebookfair.com. This is where people came from in July:

Listing the top 30 referring sites by the number of requests, sorted by
the number of requests.
  reqs  %reqs  site
236070  19.04% http://www.google.com/
125094  10.09% http://en.wikipedia.org/
107354   8.66% http://worldebookfair.com/
 57974   4.68% http://search.yahoo.com/
 31210   2.52% http://www.google.co.uk/
 25132   2.03% http://www.promo.net/
 18850   1.52% http://www.google.co.in/
 17347   1.40% http://www.google.ca/
 16011   1.29% http://www.google.de/
 15762   1.27% http://www.stumbleupon.com/
 13664   1.10% http://profile.myspace.com/
 13238   1.07% http://www.google.com.au/
 12807   1.03% http://my.yahoo.com/
 12650   1.02% http://www.google.fr/
 12649   1.02% http://64.233.179.104/
 11854   0.96% http://www.digg.com/
 11694   0.94% http://digg.com/
  9228   0.74% http://www.google.com.ph/
  8621   0.70% http://search.msn.com/
  7801   0.63% http://www.ovelho.com/
  7671   0.62% http://www.worldebookfair.com/
  6568   0.53% http://66.249.93.104/
  6487   0.52% http://oldfashionededucation.com/
  6475   0.52% http://www.google.es/
  6023   0.49% http://www.google.it/
  5894   0.48% http://www.google.pl/
  5854   0.47% http://librivox.org/
  5824   0.47% http://www.google.com.br/
  5751   0.46% http://www.google.nl/
  5106   0.41% http://luminis1.wright.edu/
413228  33.33% [not listed: 20,347 sites]

--
Marcello Perathoner
webmaster@gutenberg.org

From JBuck814366460 at aol.com Thu Jul 27 07:21:17 2006
From: JBuck814366460 at aol.com (Jared Buck)
Date: Thu Jul 27 07:21:24 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <44C8BC85.5060700@perathoner.de>
References: <459964460.20060726111753@noring.name> <44C8BC85.5060700@perathoner.de>
Message-ID: <44C8CBDD.1040002@aol.com>

Marcello Perathoner wrote on 27/07/2006, 6:15 AM:

> According to worldebookfair.com they serve 1 million ebooks / day.
>
> gutenberg.org serves 60.000 ebooks / day.
>
> According to alexa*) worldebookfair.com gets less traffic than
> gutenberg.org and still they manage to serve 16 times as many ebooks. I
> wonder how they do that?
>
> *)
> http://www.gutenberg.org/internal/stats/alexa
> user: internal
> pass: books
>
> On the plus side gutenberg.org gets some traffic from
> worldebookfair.com. This is where people came from in July:
>
> Listing the top 30 referring sites by the number of requests, sorted by
> the number of requests.
>
>   reqs  %reqs  site
> 236070  19.04% http://www.google.com/
> 125094  10.09% http://en.wikipedia.org/
> 107354   8.66% http://worldebookfair.com/
>  57974   4.68% http://search.yahoo.com/
>  31210   2.52% http://www.google.co.uk/
>  25132   2.03% http://www.promo.net/
>  18850   1.52% http://www.google.co.in/
>  17347   1.40% http://www.google.ca/
>  16011   1.29% http://www.google.de/
>  15762   1.27% http://www.stumbleupon.com/
>  13664   1.10% http://profile.myspace.com/
>  13238   1.07% http://www.google.com.au/
>  12807   1.03% http://my.yahoo.com/
>  12650   1.02% http://www.google.fr/
>  12649   1.02% http://64.233.179.104/
>  11854   0.96% http://www.digg.com/
>  11694   0.94% http://digg.com/
>   9228   0.74% http://www.google.com.ph/
>   8621   0.70% http://search.msn.com/
>   7801   0.63% http://www.ovelho.com/
>   7671   0.62% http://www.worldebookfair.com/
>   6568   0.53% http://66.249.93.104/
>   6487   0.52% http://oldfashionededucation.com/
>   6475   0.52% http://www.google.es/
>   6023   0.49% http://www.google.it/
>   5894   0.48% http://www.google.pl/
>   5854   0.47% http://librivox.org/
>   5824   0.47% http://www.google.com.br/
>   5751   0.46% http://www.google.nl/
>   5106   0.41% http://luminis1.wright.edu/
> 413228  33.33% [not listed: 20,347 sites]
>
> --
> Marcello Perathoner
> webmaster@gutenberg.org
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d
>

I do say, that's good word of mouth to get over 100,000 requests from
the book fair site. Of course indeed, a lot of our traffic comes from
wikipedia (which has a nice article on PG) and from people searching
for books or for PG itself.
Any word of mouth is great, no matter where it comes from :)

Jared

--
.

From rnmscott at netspace.net.au Thu Jul 27 07:21:30 2006
From: rnmscott at netspace.net.au (rnmscott@netspace.net.au)
Date: Thu Jul 27 07:21:34 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <44C8BC85.5060700@perathoner.de>
References: <459964460.20060726111753@noring.name> <44C8BC85.5060700@perathoner.de>
Message-ID: <1154010090.44c8cbeae2248@webmail.netspace.net.au>

Has it been about the same (gutenberg downloads) while said fair has
been on?

Quoting Marcello Perathoner:

> According to worldebookfair.com they serve 1 million ebooks / day.
>
> gutenberg.org serves 60.000 ebooks / day.
>
> According to alexa*) worldebookfair.com gets less traffic than
> gutenberg.org and still they manage to serve 16 times as many ebooks. I
> wonder how they do that?

------------------------------------------------------------
This email was sent from Netspace Webmail: http://www.netspace.net.au

From marcello at perathoner.de Thu Jul 27 08:30:25 2006
From: marcello at perathoner.de (Marcello Perathoner)
Date: Thu Jul 27 08:30:30 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <1154010090.44c8cbeae2248@webmail.netspace.net.au>
References: <459964460.20060726111753@noring.name> <44C8BC85.5060700@perathoner.de> <1154010090.44c8cbeae2248@webmail.netspace.net.au>
Message-ID: <44C8DC11.2090803@perathoner.de>

rnmscott@netspace.net.au wrote:

> Has it been about the same (gutenberg downloads) while said fair has been on?

I'd say we got about 20% more book downloads in the first two weeks ...
it's back to normal now.

http://www.gutenberg.org/browse/scores/books-downloaded.png

--
Marcello Perathoner
webmaster@gutenberg.org

From gbnewby at pglaf.org Thu Jul 27 11:11:11 2006
From: gbnewby at pglaf.org (Greg Newby)
Date: Thu Jul 27 11:11:14 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <44C8BC85.5060700@perathoner.de>
References: <459964460.20060726111753@noring.name> <44C8BC85.5060700@perathoner.de>
Message-ID: <20060727181111.GA8585@pglaf.org>

On Thu, Jul 27, 2006 at 03:15:49PM +0200, Marcello Perathoner wrote:
> According to worldebookfair.com they serve 1 million ebooks / day.
>
> gutenberg.org serves 60.000 ebooks / day.
>
> According to alexa*) worldebookfair.com gets less traffic than
> gutenberg.org and still they manage to serve 16 times as many ebooks. I
> wonder how they do that?

My first guess is that since Alexa is based on sampling, their
estimate is incorrect.

I've watched traffic from wef since it started, and we've been pushing
anywhere from 20Mbps to as high as 100Mbps (with typical daily peaks
of 40-60Mbps). That's a lot of data. The last time I heard, UNC (where
iBiblio is based) has 600Mbps total capacity, and about 1/3 of that
(200Mbps) is allocated to iBiblio, where gutenberg.org lives. Those
numbers might have increased in the last few years, however.

On the other hand, maybe I'm counting wrong. I'll be looking at the
7GB access_log (currently) in detail once the WEF is over, and maybe
Marcello can help so we can compare apples to apples. I have tried to
only include successful/completed downloads, and also to only include
eBooks (not stuff like front page images and the catalog page), but
the count is based on a simple "grep" so could be off.

One other factoid: We are using iptables to limit the number of
simultaneous connections from a single IP address. (This might make
for some unhappy proxy users, unfortunately.)

The download total as of right now is just over 19 million.
  -- Greg

> *) http://www.gutenberg.org/internal/stats/alexa
> user: internal
> pass: books
>
> On the plus side gutenberg.org gets some traffic from
> worldebookfair.com. This is where people came from in July:
>
> Listing the top 30 referring sites by the number of requests, sorted by
> the number of requests.
>
>   reqs  %reqs  site
> 236070  19.04% http://www.google.com/
> 125094  10.09% http://en.wikipedia.org/
> 107354   8.66% http://worldebookfair.com/
>  57974   4.68% http://search.yahoo.com/
>  31210   2.52% http://www.google.co.uk/
>  25132   2.03% http://www.promo.net/
>  18850   1.52% http://www.google.co.in/
>  17347   1.40% http://www.google.ca/
>  16011   1.29% http://www.google.de/
>  15762   1.27% http://www.stumbleupon.com/
>  13664   1.10% http://profile.myspace.com/
>  13238   1.07% http://www.google.com.au/
>  12807   1.03% http://my.yahoo.com/
>  12650   1.02% http://www.google.fr/
>  12649   1.02% http://64.233.179.104/
>  11854   0.96% http://www.digg.com/
>  11694   0.94% http://digg.com/
>   9228   0.74% http://www.google.com.ph/
>   8621   0.70% http://search.msn.com/
>   7801   0.63% http://www.ovelho.com/
>   7671   0.62% http://www.worldebookfair.com/
>   6568   0.53% http://66.249.93.104/
>   6487   0.52% http://oldfashionededucation.com/
>   6475   0.52% http://www.google.es/
>   6023   0.49% http://www.google.it/
>   5894   0.48% http://www.google.pl/
>   5854   0.47% http://librivox.org/
>   5824   0.47% http://www.google.com.br/
>   5751   0.46% http://www.google.nl/
>   5106   0.41% http://luminis1.wright.edu/
> 413228  33.33% [not listed: 20,347 sites]
>
> --
> Marcello Perathoner
> webmaster@gutenberg.org
>
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

From jon at noring.name Thu Jul 27 11:43:46 2006
From: jon at noring.name (Jon Noring)
Date: Thu Jul 27 11:44:00 2006
Subject: [gutvol-d] World eBook Fair: 12 million downloads. Anyone notice?
In-Reply-To: <20060727181111.GA8585@pglaf.org>
References: <459964460.20060726111753@noring.name> <44C8BC85.5060700@perathoner.de> <20060727181111.GA8585@pglaf.org>
Message-ID: <475770793.20060727124346@noring.name>

Greg wrote:
> On the other hand, maybe I'm counting wrong.
> I'll be looking at the 7GB access_log (currently) in detail once the
> WEF is over, and maybe Marcello can help so we can compare apples
> to apples. I have tried to only include successful/completed
> downloads, and also to only include eBooks (not stuff like
> front page images and the catalog page), but the count
> is based on a simple "grep" so could be off.
>
> One other factoid: We are using iptables to limit the number
> of simultaneous connections from a single IP address. (This
> might make for some unhappy proxy users, unfortunately.)
>
> The download total as of right now is just over 19 million.

A more telling statistic would be the number of unique downloaders
rather than books downloaded. I hypothesize that a sizable chunk of
the downloads for the WeBF are being done by a relatively small number
of people who are massively downloading the collection, especially the
non-PG stuff.

Jon Noring

From Bowerbird at aol.com Thu Jul 27 12:24:46 2006
From: Bowerbird at aol.com (Bowerbird@aol.com)
Date: Thu Jul 27 12:24:58 2006
Subject: [gutvol-d] re: 19 million
Message-ID: <584.1b5aff3.31fa6cfe@aol.com>

greg said:
> The download total as of right now is just over 19 million.

19 million downloads out of that measly p.r.? i'm impressed!
i wouldn't have been surprised if nobody even heard about it.
these days, if you don't have a multi-million-dollar ad budget,
the big boys will usually drown you out with all their big noise.
but i guess that word "free" still hasn't lost its magic touch, eh?
people are sheep.

so good job on the hype, michael! but now get back to work. :+)

-bowerbird

p.s. i think to become a discussion topic on one certain listserve
and another certain blog, you have to be promoting some vapor,
and make grandiose promises that e-books will soon cure cancer.
reality -- especially .pdf reality -- is too humdrum for some people.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pglaf.org/private.cgi/gutvol-d/attachments/20060727/c5b4f73f/attachment.html
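[Editor's note: Greg's grep-based download tally, described in the thread above (keep only ebook requests, keep only successful/completed transfers, count the lines), might look roughly like the sketch below. Everything in it is illustrative: the sample log, the "/files/" URL prefix, and the field positions are assumptions, not the actual command or layout used on gutenberg.org.]

```shell
# Hypothetical sketch of a grep-based download count like the one Greg
# describes -- the log name and the "/files/" prefix are assumptions.

# A tiny sample access log (Apache combined format) so this runs anywhere.
cat > sample_access.log <<'EOF'
1.2.3.4 - - [27/Jul/2006:10:00:00 +0000] "GET /files/18735/18735.txt HTTP/1.1" 200 52344
1.2.3.4 - - [27/Jul/2006:10:00:05 +0000] "GET /files/18735/18735-h.htm HTTP/1.1" 206 1024
5.6.7.8 - - [27/Jul/2006:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 8192
9.9.9.9 - - [27/Jul/2006:10:02:00 +0000] "GET /files/10001/10001.zip HTTP/1.1" 200 99120
EOF

# Count only ebook fetches (URLs under /files/) answered with status 200,
# excluding partial transfers (206) and non-book hits like the front page --
# roughly the "successful/completed downloads" filter described above.
grep '"GET /files/' sample_access.log | awk '$9 == 200' | wc -l
```

Even with 206 (partial content) responses excluded, a 200 status does not prove the client stayed connected to the end of the file, which may be part of why a simple grep-based tally "could be off."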