From: Rui Pedro Mendes Salgueiro Newsgroups: comp.sys.ibm.pc.hardware.chips,comp.sys.intel,alt.folklore.computers Subject: Re: Google buys Deja Followup-To: alt.folklore.computers,comp.sys.ibm.pc.hardware.chips Date: 15 Feb 2001 11:36:47 -0000 Organization: Universidade de Coimbra Lines: 47 Message-ID: <96gf0f$dm8$1@rena.mat.uc.pt> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> NNTP-Posting-Host: localhost X-Trace: rena.mat.uc.pt 982237011 14025 127.0.0.1 (15 Feb 2001 11:36:51 GMT) X-Complaints-To: usenet@rena.mat.uc.pt NNTP-Posting-Date: 15 Feb 2001 11:36:51 GMT User-Agent: tin/pre-1.4-981114 ("The Watchman") (UNIX) (FreeBSD/2.2.7-RELEASE (i386)) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!nntp01.fccn.pt!rena.mat.uc.pt!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75542 In comp.sys.ibm.pc.hardware.chips chrisv wrote: > It seems strange to me that the world's ability to search these > archives is solely dependent on one company's good will (or one > company's attempt to make a profit at it). It should be a government > function, like libraries. As to the archive itself, I think it is usually believed that places like the NSA and similar should have very complete archives. So the government is already taking care of that function. (I wonder if it would be possible to get a copy of that info under the FOIA (Freedom of Information Act).) Having a library (for instance the Library of Congress) get a copy of that archive and implement a search engine for it (you can get slaves^Wgrad students to do the programming) would be nice. But there is a problem: the value/byte of Usenet (especially in recent years) is quite low (myself, I no longer think it is worth enough to use better hardware than a PC with IDE disks *). The hardware and bandwidth expense to create a full archive of Usenet is probably hard to justify. 
I once talked with someone from the Portuguese National Library and asked him what he thought about archiving the pt.* hierarchy. IIRC, he was not too keen on the idea, due to this problem of the low value/expense ratio. Ob comp.sys.ibm.pc.hardware related material: * An Asus A7V with a Thunderbird 900. This board has 4 IDE buses (2 on the chipset, 2 connected to an on-board Promise ATA/100 controller). With 3 striped disks (using Vinum under FreeBSD) attached to separate controllers, I measured about 35-40 MB/s sequential write performance, which I suppose is not too bad. Currently that computer has 6 IDE disks (2 filesystems, with 3 disks each). The aggregate bandwidth is still 40 MB/s. If I ever receive the Promise IDE-PCI controllers I ordered, I will try to check if it is possible to improve that by keeping the disks on separate IDE buses. P.S.: Note crosspost and followup. -- http://www.mat.uc.pt/~rps/f1/ an half-tifoso until Canada 2000 Mark Sandman - Morphine, RIP (1952-1999/07/03, Italy) .pt is Portugal| `Whom the gods love die young'-Menander (342-292 BC) Europe | Villeneuve 50-82, Toivonen 56-86, Senna 60-94 ###### From: chrisv Newsgroups: alt.folklore.computers,comp.sys.ibm.pc.hardware.chips Subject: Re: Google buys Deja Message-ID: References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> X-Newsreader: Forte Agent 1.8/32.548 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 9 Date: Fri, 16 Feb 2001 17:07:56 GMT NNTP-Posting-Host: 199.86.4.55 X-Complaints-To: abuse@onvoy.com X-Trace: news7.onvoy.net 982343276 199.86.4.55 (Fri, 16 Feb 2001 11:07:56 CST) NNTP-Posting-Date: Fri, 16 Feb 2001 11:07:56 CST Organization: Onvoy Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!upp1.onvoy!onvoy.com!news7.onvoy.net.POSTED!not-for-mail Xref: chonsp.franklin.ch 
alt.folklore.computers:75714 Rui Pedro Mendes Salgueiro wrote: >But there is a problem: the value/byte of Usenet (specially in >recent years) is quite low What? Flames and trolls have no value?? 8) ###### From: "Samir Mahendra" Newsgroups: alt.folklore.computers,comp.sys.ibm.pc.hardware.chips References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> Subject: Re: Google buys Deja Lines: 33 X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Message-ID: Date: Sun, 18 Feb 2001 12:40:18 GMT NNTP-Posting-Host: 65.0.234.216 X-Complaints-To: abuse@home.net X-Trace: news1.frmt1.sfba.home.com 982500018 65.0.234.216 (Sun, 18 Feb 2001 04:40:18 PST) NNTP-Posting-Date: Sun, 18 Feb 2001 04:40:18 PST Organization: Excite@Home - The Leader in Broadband http://home.com/faster Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!nntp-out.monmouth.com!newspeer.monmouth.com!newshub2.home.com!news.home.com!news1.frmt1.sfba.home.com.POSTED!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75797 "Rui Pedro Mendes Salgueiro" wrote in message news:96gf0f$dm8$1@rena.mat.uc.pt... > In comp.sys.ibm.pc.hardware.chips chrisv wrote: > > It seems strange to me that the world's ability to search these > > archives is soley dependant on one company's good will (or one > > company's attempt to make a profit at it). It should be a government > > function, like libraries. > But there is a problem: the value/byte of Usenet (specially in > recent years) is quite low (myself, I no longer think it is > worth enough to use better hardware than a PC with IDE disks *). > The hardware and bandwidth expense to create a full archive of > Usenet is probably hard to justify. OK, so there's a lot of crap out there, but there's also a lot of good use in having a Usenet archive. 
I've used Deja's archive extensively doing software development, saving a lot of time and effort in figuring out API's configurations, etc. I've also used the archive for less technical expertise, e.g found lyrics to old songs, found some good hiking trails in my local area, etc. I probably could have found the same information by posting a message to the newsgroup, but why ask the same question again when the answer can be found in an archive. --Samir ###### Newsgroups: alt.folklore.computers,comp.sys.ibm.pc.hardware.chips Subject: Re: Google buys Deja From: otakuvidiot@hotSPAMMYmail.com (O_v) References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> Organization: We Are Suck Inc. Message-ID: <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> User-Agent: Xnews/03.04.11 Lines: 25 X-Complaints-To: abuse@usenetserver.com X-Abuse-Info: Please be sure to forward a copy of ALL headers X-Abuse-Info: Otherwise we will be unable to process your complaint properly. NNTP-Posting-Date: Sun, 18 Feb 2001 23:50:35 EST Date: Mon, 19 Feb 2001 04:50:35 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!sunqbc.risq.qc.ca!newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com!news-sjo.usenetserver.com!news-out.usenetserver.com!news-west.usenetserver.com.POSTED!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75886 Whilst reliving childhood traumas, O_v spied Samir Mahendra's 18 Feb 2001 message... >OK, so there's a lot of crap out there, but there's also a lot of >good use in having a Usenet archive. I've used Deja's archive >extensively doing software development, saving a lot of time >and effort in figuring out API's configurations, etc. Not only that! There's tons of advantages to having a full, researchable, culture-historical database out there. Someone in 2005 researching uses on the Internet in 1995 would have the information right there. 
Or, how about a work around "popular reaction to President Clinton's impeachment process?" All these would be searchable. Sure, there's a lot of crap on Usenet. But there's also much that's simply invaluable! -- O_v Antique Computer and Videogame Books and Ephemera! http://www.otakuboy.com Featured title: TI-99/4a USER'S REFERENCE GUIDE - 1981 ###### From: Bernie Cosell Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: Mon, 19 Feb 2001 11:19:04 -0500 Organization: Fantasy Farm Fibers Message-ID: References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> X-Newsreader: Forte Agent 1.8/32.548 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Complaints-To: newsabuse@supernews.com Lines: 19 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!news-fra1.dfn.de!news0.de.colt.net!colt.net!news.maxwell.syr.edu!feeder.qis.net!sn-xit-02!sn-post-01!supernews.com!news.supernews.com!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75849 Rui Pedro Mendes Salgueiro wrote: } But there is a problem: the value/byte of Usenet (specially in } recent years) is quite low (myself, I no longer think it is } worth enough to use better hardware than a PC with IDE disks *). } The hardware and bandwidth expense to create a full archive of } Usenet is probably hard to justify. Just to provide a data point, some time at the end of last year, a full usenet feed crossed 200 gigs a day. Yes, gigs, and yes, each day. I think that if you don't bother archiving any of the binary postings, though, it is still around only ["only"..:o)] one gig a day which is probably manageable [in a world with $US300 50 gig hard drives]. 
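For a rough sense of scale, Bernie's figures can be turned into a back-of-the-envelope estimate. The sketch below uses only the numbers quoted in the post (about 200 gigs a day for a full feed, about one gig a day without binaries, and US$300 for a 50 gig drive). The function name and the simplifications (constant daily volume, no compression, no redundancy) are assumptions made for illustration, not anything stated in the thread.

```python
import math

DRIVE_GB = 50    # drive capacity quoted in the post
DRIVE_USD = 300  # drive price quoted in the post


def yearly_drive_cost(gb_per_day, days=365):
    """Return (drives_needed, cost_usd) to hold `days` of feed.

    Assumes constant daily volume, no compression and no redundancy;
    these are illustrative simplifications, not claims from the thread.
    """
    total_gb = gb_per_day * days
    drives = math.ceil(total_gb / DRIVE_GB)
    return drives, drives * DRIVE_USD


text_only = yearly_drive_cost(1)    # feed without binary postings
full_feed = yearly_drive_cost(200)  # full feed including binaries

print("text only:", text_only)
print("full feed:", full_feed)
```

Under these assumptions a year of the text-only feed fits on a handful of drives, while the full feed needs well over a thousand, which is the gap the "only one gig a day" remark is pointing at.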
/Bernie\ -- Bernie Cosell Fantasy Farm Fibers bernie@fantasyfarm.com Pearisburg, VA --> Too many people, too few sheep <-- ###### From: michael.wojcik@merant.com (Michael Wojcik) Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: 19 Feb 2001 22:16:00 GMT Organization: MERANT Inc. Lines: 78 Message-ID: <96s5v00228e@news2.newsguy.com> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> Reply-To: michael.wojcik@merant.com NNTP-Posting-Host: p-730.newsdawg.com X-Newsreader: xrn 9.00 Originator: mww@lorelei.michaelwojcik.org Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!enews.sgi.com!pln-w!spln!dex!extra.newsguy.com!newsp.newsguy.com!mww Xref: chonsp.franklin.ch alt.folklore.computers:75838 In article <904CFEF24otakuvidiothotSPAMMY@64.152.100.100>, otakuvidiot@hotSPAMMYmail.com (O_v) writes: > Whilst reliving childhood traumas, O_v spied Samir Mahendra's 18 Feb 2001 > message... > >OK, so there's a lot of crap out there, but there's also a lot of > >good use in having a Usenet archive. I've used Deja's archive > >extensively doing software development, saving a lot of time > >and effort in figuring out API's configurations, etc. > Not only that! There's tons of advantages to having a full, researchable, > culture-historical database out there. Once Google gets the archives online, you may want to look up the last conversation we had in afc on this topic. Some of us (me, and I think Barb, and probably some others) advocated full archival with perhaps some of the obvious exceptions - exclude binaries, collapse duplicate messages into true crossposts when possible, etc. (Excluding binaries is the big one, of course, since they account for most of the volume. Anything else is just to trim the non-binary archive down a bit.) 
Eugene Miya described his efforts to find researchers and archivists interested in actually pursuing such a project. He didn't have much luck. He also cited sources with specific figures on Usenet volume and growth patterns. His take (I hope I'm representing it fairly; Eugene, if you're reading this, feel free to correct me) was that while targetted archives are certainly desirable, it's unlikely that anyone will build and maintain a full Usenet archive, and it wouldn't be the best use of available resources anyway. Someone else posted some sample compression figures on a decent-size archive of their feed, using a compressor designed for large text collections (bzip2, I think - a BWT followed by an arithmetic encoder IIRC). They're about what you'd expect; with about two bits of entropy per English letter, and probably around the same for other languages, compression is likely to give you less than an order of magnitude savings. I've come around to Eugene's viewpoint. Selecting as large a list of "interesting" groups as your archive (whatever it may be) can handle will probably still give you a useful view of "Usenet culture". Work in the humanities always involves working with select pieces anyway. If you need to describe aggregate behavior (what fraction of posts about the Florida election favored Bush, say), you can do statistical analyses on what's in your archive and include your margin of error in your report, just as if you were taking a poll. It's hard to imagine a meaningful project that would require the complete text of every single Usenet message (that's a bit of a hard beast to define anyway, thanks to Usenet's noncentralized organization) written since whenever a hypothetical complete archive began. And I have to admit I haven't taken any initiative myself to archive anything. I used to save Usenet posts I read and thought were worth keeping, but when DejaNews started I got lazy. 
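The compression estimate a few paragraphs up can be checked with simple arithmetic. The sketch below assumes the text is stored at 8 bits per character and takes the roughly two bits of entropy per letter mentioned in the post as the floor an ideal compressor could reach. The function name and the 8-bit storage figure are assumptions for illustration; real archives also carry headers and quoting, so actual ratios will differ.

```python
def best_case_ratio(stored_bits_per_char=8.0, entropy_bits_per_char=2.0):
    """Upper bound on the compression ratio of an ideal coder.

    An ideal coder cannot use fewer bits per character than the
    entropy, so the ratio is bounded by stored bits / entropy bits.
    """
    return stored_bits_per_char / entropy_bits_per_char


ratio = best_case_ratio()
print("best-case ratio:", ratio)
print("less than an order of magnitude:", ratio < 10)
```

With these inputs the bound is a factor of four, consistent with the claim that compression buys less than an order of magnitude.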
These days I *could* pull a select set of groups I'm interested in from a feed, throw it onto a cheap hard drive, run an indexer in the background, and have an automated process burn a CD of the oldest stuff every once in a while to free up space. All I'd have to do is put a new blank in the burner and file the old one away. I have all the hardware except the drive (and I have a spare 3GB around here somewhere, which would probably be sufficient for the groups I read), and the software is either free (for the indexing engine I'd use Savant, or possibly Tact) or written by me. But I haven't done even that much. How many people have? -- Michael Wojcik michael.wojcik@merant.com Comms Development, MERANT (block capitals are a company mandate) Department of English, Miami University It's like being shot at in an airport with all those guys running around throwing hand grenades. Certain people function better with hand grenades coming from all sides than other people do when the hand grenades are only coming from inside out. 
-- Dick Selcer, coach of the Cinci Bengals ###### From: Sean Harding Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: Tue, 20 Feb 2001 08:21:30 -0000 Organization: The Dogcow Cartel Message-ID: Sender: Sean Harding References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> User-Agent: tin/1.5.7-20001104 ("Paradise Regained") (UNIX) (Linux/2.4.0-test12 (i586)) X-Complaints-To: newsabuse@supernews.com Lines: 17 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!news.maxwell.syr.edu!newsfeed.stanford.edu!sn-xit-01!sn-post-02!sn-post-01!supernews.com!corp.supernews.com!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75879 Michael Wojcik wrote: > probably be sufficient for the groups I read), and the software is > either free (for the indexing engine I'd use Savant, or possibly Tact) Hmm. Interesting. Do you have links for more information on those packages? I wasn't able to turn anything up in a search, but that may be because the strings "savant" and "tact" are too common. I'm looking for decent text indexing software (free because it's for a completely non-commercial personal hobby project) and I haven't had a whole lot of luck... sean -- Sean Harding |"Everyone's got their own TV http://www.dogcow.org/sean/ | with their own episode of Biography." Address in header *is* valid | --The Nields ###### From: michael.wojcik@merant.com (Michael Wojcik) Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: 20 Feb 2001 16:26:33 GMT Organization: MERANT Inc. 
Lines: 48 Message-ID: <96u5rp01gld@news2.newsguy.com> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> Reply-To: michael.wojcik@merant.com NNTP-Posting-Host: p-472.newsdawg.com X-Newsreader: xrn 9.00 Originator: mww@lorelei.michaelwojcik.org Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!enews.sgi.com!pln-w!spln!dex!extra.newsguy.com!newsp.newsguy.com!mww Xref: chonsp.franklin.ch alt.folklore.computers:75852 In article , Sean Harding writes: > Michael Wojcik wrote: > > probably be sufficient for the groups I read), and the software is > > either free (for the indexing engine I'd use Savant, or possibly Tact) > Hmm. Interesting. Do you have links for more information on those packages? Savant is a word-stemming indexing engine from MIT, used in Bradley Rhodes' Remembrance Agent (RA) [1]. RA is a "just-in-time associative memory system" that sits inside an editor (Emacs, generally, though there are RA implementations for various platforms now), watches what you type, and searches indexed documents, displaying the results as you write. Savant is nice because it understands a bunch of different file formats and is easy to automate. Savant is GPL'd, I believe. TACT is an indexer developed at the University of Toronto [2]. It's been several years since I looked at it, and I mentioned it only because I had used it a couple of times. On reviewing the documentation I note that it's only for MS-DOS, no longer being actively developed, and apparently not available in source, which makes it rather less desirable. It's interesting for historical reasons, though, and people running MS-DOS or an emulator might want it just for occasional text-file queries. TACT is listed as shareware, according to the documentation, but there's no requested contribution for its use. 
The print/CD-ROM manual is $50 (according to the web page - I don't know if it's still available), but there's on-line help; the manual isn't necessary. 1. http://rhodes.www.media.mit.edu/people/rhodes/RA/ 2. http://www.chass.utoronto.ca/cch/tact.html -- Michael Wojcik michael.wojcik@merant.com Comms Development, MERANT (block capitals are a company mandate) Department of English, Miami University He smiled and let his gaze fall to hers, so that her cheek began to glow. Ecstatically she waited until his mouth slowly neared her own. She knew only one thing: rdoeniadtrgove niardgoverdgovnrdgog. ###### From: Ian Stirling Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: Tue, 20 Feb 2001 21:19:29 GMT Message-ID: <982703969.10024.0.nnrp-13.9e98d142@news.demon.co.uk> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> NNTP-Posting-Host: mauve.demon.co.uk X-NNTP-Posting-Host: mauve.demon.co.uk:158.152.209.66 X-Trace: news.demon.co.uk 982703969 nnrp-13:10024 NO-IDENT mauve.demon.co.uk:158.152.209.66 X-Complaints-To: abuse@demon.net User-Agent: tin/1.5.6-20000803 ("Dust") (UNIX) (Linux/2.4.0-test7 (i686)) Originator: root@mauve.demon.co.uk Lines: 24 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.dplanet.ch!news-ge.switch.ch!news-fra1.dfn.de!news.tele.dk!194.176.220.130!newsfeed.icl.net!dispose.news.demon.net!news.demon.co.uk!demon!mauve.demon.co.uk!root Xref: chonsp.franklin.ch alt.folklore.computers:75957 Bernie Cosell wrote: >Rui Pedro Mendes Salgueiro wrote: >} But there is a problem: the value/byte of Usenet (specially in >} recent years) is quite low (myself, I no longer think it is >} worth enough to use better hardware than a PC with IDE disks *). >} The hardware and bandwidth expense to create a full archive of >} Usenet is probably hard to justify. >Just to provide a data point, some time at the end of last year, a full >usenet feed crossed 200 gigs a day. 
Yes, gigs, and yes, each day. I think >that if you don't bother archiving any of the binary postings, though, it >is still around only ["only"..:o)] one gig a day which is probably >manageable [in a world with $US300 50 gig hard drives]. I did some basic sums a while back, and it looks like up to april 1995, the whole cumulative volume of usenet is around 5G. Has google released a statement about old archives? -- http://inquisitor.i.am/ | mailto:inquisitor@i.am | Ian Stirling. ---------------------------+-------------------------+-------------------------- Among a mans many good possessions, A good command of speech has no equal. ###### From: jra@dorothy.msas.net (Jay R. Ashworth) Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> Reply-To: jra@baylink.com Message-ID: User-Agent: slrn/0.9.6.2 (Linux) Lines: 15 Date: Wed, 21 Feb 2001 23:20:31 GMT NNTP-Posting-Host: 65.32.104.85 X-Complaints-To: abuse@rr.com X-Trace: typhoon.tampabay.rr.com 982797631 65.32.104.85 (Wed, 21 Feb 2001 18:20:31 EST) NNTP-Posting-Date: Wed, 21 Feb 2001 18:20:31 EST Organization: RoadRunner - TampaBay Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!enews.sgi.com!nntp.primenet.com!nntp.gblx.net!news.maxwell.syr.edu!newsfeed.skycache.com!Cidera!cyclone.tampabay.rr.com!typhoon.tampabay.rr.com.POSTED!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:75977 On 19 Feb 2001 22:16:00 GMT, Michael Wojcik wrote: > But I haven't done even that much. How many people have? Noone has. We knew Deja was doing it. Cheers, -- jra -- Jay R. 
Ashworth jra@baylink.com Member of the Technical Staff Baylink The Suncoast Freenet The Things I Think Tampa Bay, Florida http://baylink.pitas.com +1 727 804 5015 ###### From: Ian Stirling Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: Thu, 22 Feb 2001 18:39:19 GMT Message-ID: <982867159.20530.0.nnrp-09.9e98d142@news.demon.co.uk> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> NNTP-Posting-Host: mauve.demon.co.uk X-NNTP-Posting-Host: mauve.demon.co.uk:158.152.209.66 X-Trace: news.demon.co.uk 982867159 nnrp-09:20530 NO-IDENT mauve.demon.co.uk:158.152.209.66 X-Complaints-To: abuse@demon.net User-Agent: tin/1.5.6-20000803 ("Dust") (UNIX) (Linux/2.4.0-test7 (i686)) Originator: root@mauve.demon.co.uk Lines: 24 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.tesion.net!news.belwue.de!news-stu1.dfn.de!news-koe1.dfn.de!news-fra1.dfn.de!news.tele.dk!194.176.220.130!newsfeed.icl.net!dispose.news.demon.net!news.demon.co.uk!demon!mauve.demon.co.uk!root Xref: chonsp.franklin.ch alt.folklore.computers:75983 Jay R. Ashworth wrote: >On 19 Feb 2001 22:16:00 GMT, > Michael Wojcik wrote: >> But I haven't done even that much. How many people have? >Noone has. >We knew Deja was doing it. I know I've been archiving 20-30 groups for the past few years. Onto everything from tapes, to floppy, to CD's. (now exclusively CD) If anyone would donate a shell account, with a GB or so of disc, I'd be willing to setup a "old news" archive, focusing on getting a complete archive available via nntp, amongst other things, up to perhaps 95. I'd do this at home, but I've only got a shared 8KB/sec connection that's up only 16 hours/day. -- http://inquisitor.i.am/ | mailto:inquisitor@i.am | Ian Stirling. 
---------------------------+-------------------------+-------------------------- Among a mans many good possessions, A good command of speech has no equal. ###### Path: chonsp.franklin.ch!not-for-mail From: Neil Franklin Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: 22 Feb 2001 22:05:51 +0100 Organization: My own Private Self Lines: 48 Message-ID: <6u4rxmwhqo.fsf@chonsp.franklin.ch> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> <982867159.20530.0.nnrp-09.9e98d142@news.demon.co.uk> NNTP-Posting-Host: chonsp.franklin.ch X-Trace: chonsp.franklin.ch 982875951 764 10.0.3.2 (22 Feb 2001 21:05:51 GMT) X-Complaints-To: news@chonsp.franklin.ch NNTP-Posting-Date: 22 Feb 2001 21:05:51 GMT X-Newsreader: Gnus v5.7/Emacs 20.4 Xref: chonsp.franklin.ch alt.folklore.computers:75992 Ian Stirling writes: > Jay R. Ashworth wrote: > >On 19 Feb 2001 22:16:00 GMT, > > Michael Wojcik wrote: > >> But I haven't done even that much. How many people have? > > >Noone has. > > I know I've been archiving 20-30 groups for the past few years. Dito here, all the groups I presently read (6) since when I subscribed to them. a.f.c since Dec 1997. About 1-1.5 years ago someone posted about having all of a.f.c since its beginning, ca 1 CD full in volume. Since then I have been thinking on and off about asking him for a copy and setting up an afc website with an live newsfeed. I just do not have enough time for it. :-( > Onto > everything from tapes, to floppy, to CD's. > (now exclusively CD) Presently just onto 2 HDs (local news server and my backup server). > If anyone would donate a shell account, with a GB or so of disc, > I'd be willing to setup a "old news" archive, focusing on getting a complete > archive available via nntp, amongst other things, up to perhaps 95. Why not up to today, including live feed? 
As it is just a few groups that should be handleable. > I'd do this at home, but I've only got a shared 8KB/sec connection > that's up only 16 hours/day. I have at home an flaky (down presently) cable modem with dynamic IP and no constantly running computer. At work my employer wont allow such sites (I asked for hosting an user group). Argh! -- Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ Hacker, Unix Guru, El Eng FH/BSc, Sysadmin, Roleplayer, LARPer, Mystic ###### From: bfd Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Message-ID: <1nsa9t4r2iakic8uu9vvc0oavmedern9to@4ax.com> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <982703969.10024.0.nnrp-13.9e98d142@news.demon.co.uk> X-Newsreader: Forte Agent 1.8/32.548 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 26 Date: Thu, 22 Feb 2001 20:16:50 GMT NNTP-Posting-Host: 24.27.27.170 X-Complaints-To: abuse@rr.com X-Trace: typhoon.austin.rr.com 982873010 24.27.27.170 (Thu, 22 Feb 2001 14:16:50 CST) NNTP-Posting-Date: Thu, 22 Feb 2001 14:16:50 CST Organization: Road Runner - Texas Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!feed2.news.luth.se!luth.se!logbridge.uoregon.edu!pln-w!extra.newsguy.com!lotsanews.com!cyclone.tampabay.rr.com!cyclone.austin.rr.com!cyclone2.austin.rr.com!typhoon.austin.rr.com.POSTED!not-for-mail Xref: chonsp.franklin.ch alt.folklore.computers:76026 i read in a news release that it was around 1 terabyte. On Tue, 20 Feb 2001 21:19:29 GMT, Ian Stirling wrote: >Bernie Cosell wrote: >>Rui Pedro Mendes Salgueiro wrote: > >>} But there is a problem: the value/byte of Usenet (specially in >>} recent years) is quite low (myself, I no longer think it is >>} worth enough to use better hardware than a PC with IDE disks *). 
>>} The hardware and bandwidth expense to create a full archive of >>} Usenet is probably hard to justify. > >>Just to provide a data point, some time at the end of last year, a full >>usenet feed crossed 200 gigs a day. Yes, gigs, and yes, each day. I think >>that if you don't bother archiving any of the binary postings, though, it >>is still around only ["only"..:o)] one gig a day which is probably >>manageable [in a world with $US300 50 gig hard drives]. > >I did some basic sums a while back, and it looks like up to april 1995, the >whole cumulative volume of usenet is around 5G. > >Has google released a statement about old archives? ###### From: jfrancis@dungeon.engr.sgi.com (John Francis) Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: 22 Feb 2001 21:28:43 GMT Organization: Silicon Graphics, Inc., Mountain View, CA Lines: 17 Message-ID: <9740ab$61foa$1@fido.engr.sgi.com> References: <982867159.20530.0.nnrp-09.9e98d142@news.demon.co.uk> <6u4rxmwhqo.fsf@chonsp.franklin.ch> NNTP-Posting-Host: dungeon.engr.sgi.com X-Trace: fido.engr.sgi.com 982877323 6340362 130.62.53.248 (22 Feb 2001 21:28:43 GMT) X-Complaints-To: news@fido.engr.sgi.com NNTP-Posting-Date: 22 Feb 2001 21:28:43 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!enews.sgi.com!fido.engr.sgi.com!dungeon.engr.sgi.com!jfrancis Xref: chonsp.franklin.ch alt.folklore.computers:76007 In article <6u4rxmwhqo.fsf@chonsp.franklin.ch>, Neil Franklin wrote: > >About 1-1.5 years ago someone posted about having all of a.f.c since >its beginning, ca 1 CD full in volume. Since then I have been thinking >on and off about asking him for a copy and setting up an afc website >with an live newsfeed. > >I just do not have enough time for it. :-( In theory this would be within the charter of the Computer Museum History Center ( http://www.computerhistory.org ). In practice it won't happen over the next few months, at any rate. 
But if you've still got the email address of the poster could you ask him to let them have a copy? ###### Path: chonsp.franklin.ch!not-for-mail From: Neil Franklin Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: 23 Feb 2001 21:33:24 +0100 Organization: My own Private Self Lines: 27 Message-ID: <6u3dd5m963.fsf@chonsp.franklin.ch> References: <982867159.20530.0.nnrp-09.9e98d142@news.demon.co.uk> <6u4rxmwhqo.fsf@chonsp.franklin.ch> <9740ab$61foa$1@fido.engr.sgi.com> NNTP-Posting-Host: chonsp.franklin.ch X-Trace: chonsp.franklin.ch 982960404 560 10.0.3.2 (23 Feb 2001 20:33:24 GMT) X-Complaints-To: news@chonsp.franklin.ch NNTP-Posting-Date: 23 Feb 2001 20:33:24 GMT X-Newsreader: Gnus v5.7/Emacs 20.4 Xref: chonsp.franklin.ch alt.folklore.computers:76028 jfrancis@dungeon.engr.sgi.com (John Francis) writes: > In article <6u4rxmwhqo.fsf@chonsp.franklin.ch>, > Neil Franklin wrote: > > > >About 1-1.5 years ago someone posted about having all of a.f.c since > >its beginning, ca 1 CD full in volume. Since then I have been thinking > >on and off about asking him for a copy and setting up an afc website > >with an live newsfeed. > > > >I just do not have enough time for it. :-( > > In theory this would be within the charter of the Computer Museum > History Center ( http://www.computerhistory.org ). > > In practice it won't happen over the next few months, at any rate. > But if you've still got the email address of the poster could you > ask him to let them have a copy? It is buried somewhere in the 76000 [1] a.f.c posts I have here :-(. 
[1] Dec 1997 - Feb 2001 -- Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ Hacker, Unix Guru, El Eng FH/BSc, Sysadmin, Roleplayer, LARPer, Mystic ###### From: Ian Stirling Newsgroups: alt.folklore.computers Subject: Re: Google buys Deja Date: Fri, 23 Feb 2001 19:00:49 GMT Message-ID: <982954849.2969.0.nnrp-08.9e98d142@news.demon.co.uk> References: <96c07d01j71@news2.newsguy.com> <7k3l8tk5k5vhbpv6u0a2koom1456drqrm2@4ax.com> <96gf0f$dm8$1@rena.mat.uc.pt> <904CFEF24otakuvidiothotSPAMMY@64.152.100.100> <96s5v00228e@news2.newsguy.com> <982867159.20530.0.nnrp-09.9e98d142@news.demon.co.uk> <6u4rxmwhqo.fsf@chonsp.franklin.ch> NNTP-Posting-Host: mauve.demon.co.uk X-NNTP-Posting-Host: mauve.demon.co.uk:158.152.209.66 X-Trace: news.demon.co.uk 982954849 nnrp-08:2969 NO-IDENT mauve.demon.co.uk:158.152.209.66 X-Complaints-To: abuse@demon.net User-Agent: tin/1.5.6-20000803 ("Dust") (UNIX) (Linux/2.4.0-test7 (i686)) Originator: root@mauve.demon.co.uk Lines: 27 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!news-fra1.dfn.de!news-lei1.dfn.de!newsfeed.freenet.de!newsfeed.easynews.net!easynews.net!easynet-melon!easynet-tele!easynet.net!dispose.news.demon.net!news.demon.co.uk!demon!mauve.demon.co.uk!root Xref: chonsp.franklin.ch alt.folklore.computers:76048 Neil Franklin wrote: >Ian Stirling writes: >> Jay R. Ashworth wrote: >> >On 19 Feb 2001 22:16:00 GMT, >> > Michael Wojcik wrote: >> >> But I haven't done even that much. How many people have? >> >> >Noone has. >> >> I know I've been archiving 20-30 groups for the past few years. >> If anyone would donate a shell account, with a GB or so of disc, >> I'd be willing to setup a "old news" archive, focusing on getting a complete >> archive available via nntp, amongst other things, up to perhaps 95. >Why not up to today, including live feed? As it is just a few groups >that should be handleable. 
I was meaning as an attempt to build as complete an archive of text, from peoples private collections, as they are constantly decaying, be it due to changing interest, bit-rot, or other factors. -- http://inquisitor.i.am/ | mailto:inquisitor@i.am | Ian Stirling. ---------------------------+-------------------------+-------------------------- Windows 2000, software for next millenia. - Ian Stirling.