
Historical programming-language groups disappearing from Google

As Alex McDonald notes in this support request, Google has recently banned the old Usenet groups comp.lang.forth and comp.lang.lisp from the Google Groups system. "Of specific concern is the archive. These are some of the oldest groups on Usenet, and the depth & breadth of the historical material that has just disappeared from the internet, on two seminal programming languages, is huge and highly damaging. These are the history and collective memories of two communities that are being expunged, and it's not great, since there is no other comprehensive archive after Google's purchase of Dejanews around 20 years ago." Perhaps Google can be convinced to restore the content, but it also seems that some of this material could benefit from a more stable archive.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 15:23 UTC (Tue) by auc (subscriber, #45914) [Link]

I used to read comp.lang.lisp for a couple of years around 2004 and it was an amazing experience. The complete disappearance of such archives would be a great loss.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 16:44 UTC (Tue) by nix (subscriber, #2304) [Link]

This looks like ridiculous overreaction from some braindead automation. I thought Google was supposed to have the best AI in the world, but whenever they let that AI take any decisions that affect real people it seems to make disastrous mistakes, usually with no obvious human checking before disaster is inflicted and no recourse other than signal-boosting via the press, since Google always does this with no humans in the loop and no appeals procedure (not that it's clear who could be consulted in the case of historical archives!). This is... not a sensible way to work.

Deleting historical archives because of spam is even less sensible: it's not like more spam is going to materialize in the past history of comp.lang.lisp.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 17:51 UTC (Tue) by farnz (subscriber, #17727) [Link]

I suspect that it's happening because Google Groups is two products merged together:

  1. The old "dejanews" Usenet archives, which I'll call "Usenet" throughout this comment.
  2. Google's in-house forum system, which I'll call "forum" throughout this comment.

It makes sense to remove entire forum groups which have always had a spam problem, and where the forum group owner isn't willing to use Google's tools to moderate it and keep it spam-free; after all, if you've created a forum group for the purpose of spamming, or if you simply gave up the moment spammers found you, there's probably not much non-spam in the group. This is doubly true since the tools have been there since the forum group was created, and advertised to you as the forum group creator; AIUI, Google has reached out to their owner of record for such forum groups and asked them to clean up, so anything left is something that nobody still cares about.

However, that analysis ignores Usenet. Usenet predates Google's spam handling tools (after all, it predates Google), and has never had good tools for dealing with spam problems. Further, because there's no creator or owner on Google's systems for any given Usenet group, there's no-one to reach out to, so there's no-one who can (e.g.) close the group to new posts and clean up history, like there is for forums. Thus, unlike with forum groups, Google has no way to contact someone and say "hey, this group is spammy, please fix".

All it takes is someone designing an AI setup to clean out forum groups that are zero signal, and then running it on both Usenet and forum groups to get into this situation; chances are high that nobody involved in this decision has even realised that the two things are different, because they were merged together a long time ago.

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 1:38 UTC (Thu) by Max.Hyre (subscriber, #1054) [Link]

> Usenet predates Google's spam handling tools

Usenet predates spam. There are still a few fogies (read: me) who remember the first spam. :-(

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 16:47 UTC (Thu) by littoral (guest, #140523) [Link]

I remember it too, and it was from lawyers, IIRC. The same ad was on dozens of newsgroups, offering their services to help people immigrate to the US. Reminds me of an old joke that isn't really funny any more:
Man walks into bar and says loudly to the bartender: "Lawyers are assholes!".
Another man, sitting at the same bar, overhears the comment, turns round and says: "Hey, I resent that!".
The bartender, trying to calm things down, turns to the second man and asks politely, "Are you a lawyer, sir?"
The annoyed reply is:
"No!! I'm an asshole."

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 22:45 UTC (Thu) by jschrod (subscriber, #1646) [Link]

I was tempted to write "me too" - but then I remembered the first time these AOLers showed up on our Usenet and started to pull down the signal-to-noise ratio significantly. For quite some time, I had "@aol.com" in my kill file.

IIRC, this was even before the first spam posts by Serdar Argic et al. IMHO, Canter/Siegel are wrongly credited with having sent the first spam on Usenet. I remember clearly that the Argic bot was more of a nuisance.

But, to my pleasure, with a good Usenet provider, it's back to usable today. I'm in Germany and use Individual.net and I'm happy with it. Since almost all lusers are now on Twitter, Instagram, or Facebook, it's almost back to the experience of the late 80s or early 90s - an exchange of geeks.

Historical programming-language groups disappearing from Google

Posted Jul 31, 2020 19:49 UTC (Fri) by kmweber (guest, #114635) [Link]

Depends on how you define "spam" :)

Finn Brunton's *Spam: A Shadow History of the Internet* is a really good sociocultural history of spam that analyzes the interplay between spammers, spamfighters, and online communities writ large as both a sociological and a technological phenomenon.

Historical programming-language groups disappearing from Google

Posted Jul 31, 2020 11:04 UTC (Fri) by flussence (subscriber, #85566) [Link]

Google doesn't have the best AI, only the most profitable. Their business model is “unfortunate externalities” all the way down, and has been pretty much since they bought DoubleClick.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 17:11 UTC (Tue) by craigmaloney (guest, #117695) [Link]

This looks like it spans other groups as well. comp.sys.sinclair also has been "banned".

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 13:54 UTC (Thu) by ecree (guest, #95790) [Link]

Well, at least that gives me an idea for a CGC entry... ;)

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 19:30 UTC (Tue) by Lennie (subscriber, #49641) [Link]

This sounds like something the Internet Archive would be a good fit for.

(I hope that, despite their current legal issues, the data they have is still safe and remains so.)

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 9:30 UTC (Wed) by t-v (guest, #112111) [Link]

https://archive.org/details/usenet-comp.lang

but I would not know how complete that is.

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 21:20 UTC (Thu) by azz (subscriber, #371) [Link]

Having used this for computing history research on several occasions, I can say it's not very complete. The Internet Archive's coverage of Usenet is nearly perfect up to about 1990 thanks to Henry Spencer's utzoo collection, then there's nearly nothing from the 1990s (aside from a few groups that were gated elsewhere or released on CD-ROM), then there's fairly good coverage of the mid-2000s to the present based on a couple of different partial archives. It's a great pity that there's no public 1990s archive in raw form.

Historical programming-language groups disappearing from Google

Posted Aug 2, 2020 16:53 UTC (Sun) by neozeed (guest, #140575) [Link]

Better check the UTZOO archive again, it's been destroyed from archive.org:

In 2020 after sustained legal demands requesting a set of messages within the Usenet Archive be redacted, and to avoid further costs and accusations of manipulation should those demands be met, the archive has been removed from this URL and is not currently accessible to the public.

Historical programming-language groups disappearing from Google

Posted Aug 7, 2020 9:55 UTC (Fri) by Lennie (subscriber, #49641) [Link]

That's really sad.

It's really worrying how much legal stuff IA have to deal with.

As digital as we are, with more and more data encrypted or stored in file formats that depend on particular software being available, these times might end up as a dark age unless we have organizations like the Internet Archive to archive information and make it available in ways that allow people to access it many years later.

Historical programming-language groups disappearing from Google

Posted Aug 3, 2020 1:07 UTC (Mon) by sub2LWN (subscriber, #134200) [Link]

There is a different collection whose 90s content appears more complete:

https://archive.org/details/usenethistorical

"This historical collection of Usenet spans more than 30 years and was given to us by a generous donor." OP's link went to a dump someone made from Giganews. One of the mbox files I checked appears to have "X-Google-Thread" headers on each post, so maybe it was extracted from Google Groups somehow. In the usenethistorical collection, "usenet-comp" has 1213 zip files. Comp.os.linux's unzipped mbox is 252 MB and, by a simple grep, contains 11863 posts from 1992 and 26208 posts from 1993. There are only 1008 in 1994, however: this could also be incomplete. It seems unnatural that posts would taper off so quickly, unless 1994 coincides with a move toward IRC or LKML??
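The "simple grep" per-year count described above could be reproduced somewhat more robustly along the following lines. This is only a sketch: the function name `count_posts_by_year` is hypothetical, and it assumes the unzipped file is a standard mbox whose messages carry a parseable `Date` header (messages without one are tallied under `"unknown"`).

```python
import mailbox
from collections import Counter
from email.utils import parsedate_to_datetime


def count_posts_by_year(mbox_path):
    """Count messages per year, based on each message's Date header.

    Hypothetical helper for illustration; messages with a missing or
    unparseable Date header are counted under the key "unknown".
    """
    counts = Counter()
    # create=False: fail instead of creating an empty file if the path is wrong
    for message in mailbox.mbox(mbox_path, create=False):
        raw_date = message.get("Date")
        if raw_date is None:
            counts["unknown"] += 1
            continue
        try:
            year = parsedate_to_datetime(str(raw_date)).year
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError
            counts["unknown"] += 1
            continue
        counts[year] += 1
    return counts
```

Calling it on an unzipped archive (for example `count_posts_by_year("comp.os.linux.mbox")`, a file name assumed here) returns a `Counter` keyed by year. Unlike a grep for a year string, this only looks at the `Date` header, so quoted dates in message bodies are not counted.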

As for the groups du jour: Comp.lang.forth and comp.lang.lisp are also in "usenet-comp" as 85 MB and 106 MB zip files, respectively. Archive long and prosper!

Historical programming-language groups disappearing from Google

Posted Aug 4, 2020 23:58 UTC (Tue) by KaiRo (subscriber, #1987) [Link]

I was about to wonder what it would be like if there were some non-profit that would just archive historic things like that if we gave them to it; say, something called "archive.org", for example. Oh. What's that? There is something there...

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 19:58 UTC (Tue) by beshr (guest, #133204) [Link]

I don't understand how "banning" them would help with the spam problem, if that's what it is. Can they be convinced to at least provide dumps to archive.org?

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 20:00 UTC (Tue) by readv_ (guest, #140452) [Link]

Going to go out on a limb here and assume that this is associated with any large count of spam posters leaving nefarious links across multiple groups over the past 10 years.

This is also a good way for the 'goog' to roll this up and clean out their storage and archive a lot of the data, leaving us in a lurch if we're searching for anything older than what they will allow us to view. (Sans Int.Archives)

Besides, Stack Overflow and Reddit have become the de facto standard for most people. Be wary though, SO will soon shred that apart if they go enterprise.

/me starts thinking of an archive strategy.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 21:43 UTC (Tue) by leromarinvit (subscriber, #56850) [Link]

> This is also a good way for the 'goog' to roll this up and clean out their storage and archive a lot of the data, leaving us in a lurch if we're searching for anything older than what they will allow us to view. (Sans Int.Archives)

Compared to the gazillions of YouTube videos, their entire Usenet archive must be peanuts, so I doubt that's a concern. Or if it is, it doesn't seem like a very sensible one.

I guess it's only a few dozen terabytes compressed (but I can't find any stats excluding binaries). Presumably they don't archive the binary groups anyway, for various obvious reasons.

Historical programming-language groups disappearing from Google

Posted Jul 28, 2020 23:26 UTC (Tue) by readv_ (guest, #140452) [Link]

Send Sundar an email?

Historical programming-language groups disappearing from Google

Posted Aug 1, 2020 20:59 UTC (Sat) by ssmith32 (subscriber, #72404) [Link]

SO already offers enterprise accounts, correct?
Also, SO is a blessing and a curse:
- oh, that's a neat approach, let me test it out and do a little research (good outcome)
- hey, why did you set your initialization vector to a constant?
> That's what you're supposed to do.
>> Hmm... pretty sure it's supposed to be random
>>> No, look at this SO article
>>>> Hmm. I'm not great at algebra, but that looks wrong.
>>>>> Oh, you're right, well they had formulas, so I figured their code must be right.

(Or something similar; it's been a few years, but I'm pretty sure about the AES and the "well, they had formulas" parts.)
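For readers unfamiliar with the point being argued in that exchange, a minimal sketch of the "random, not constant" idea follows. It assumes AES in CBC mode (the comment does not say which mode was involved), where the initialization vector is one block long and should be unpredictable and fresh for every message. The names `AES_BLOCK_SIZE` and `new_iv` are hypothetical, and no actual encryption is performed here.

```python
import secrets

# AES always uses a 16-byte block, regardless of key size
AES_BLOCK_SIZE = 16


def new_iv():
    """Return a fresh, unpredictable IV; call once per message encrypted."""
    return secrets.token_bytes(AES_BLOCK_SIZE)


iv_first = new_iv()
iv_second = new_iv()
```

The IV is not secret and is typically stored or sent alongside the ciphertext; what matters in this sketch is that it comes from a cryptographically secure source rather than being hard-coded.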

they're both still in use

Posted Jul 28, 2020 22:40 UTC (Tue) by gus3 (guest, #61103) [Link]

Forth is definitely still in active use, every time a *BSD or OpenIndiana system boots.

And of course, the gajillions of Lisp dialects, including Emacs Lisp.

I sure hope this situation is a machine screw-up and not a conscious human decision.

they're both still in use

Posted Jul 29, 2020 8:39 UTC (Wed) by Wol (subscriber, #4433) [Link]

Isn't at least one major BIOS written in Forth?

Cheers,
Wol

Open Firmware uses Forth

Posted Jul 29, 2020 12:04 UTC (Wed) by dkg (subscriber, #55359) [Link]

yep. Open Firmware (aka IEEE 1275), which boots all the old PowerPC Macintosh machines, Sun SPARC, OLPC, and more, is a Forth interpreter.

Open Firmware uses Forth

Posted Jul 29, 2020 13:05 UTC (Wed) by Wol (subscriber, #4433) [Link]

Actually, I was talking about AMI, or Phoenix, or one of that lot, ie one of the x86 BIOSes.

Cheers,
Wol

Open Firmware uses Forth

Posted Jul 29, 2020 15:46 UTC (Wed) by nix (subscriber, #2304) [Link]

Most unlikely. x86 pre-EFI BIOSes are all repeatedly-hacked horrors from the early 80s, as I understand it, and back then it was raw assembler or nothing.

Open Firmware uses Forth

Posted Jul 29, 2020 23:16 UTC (Wed) by Wol (subscriber, #4433) [Link]

I've lost my copy of Starting Forth, sadly (I used to own a Jupiter Ace), but I'm pretty certain Forth dates from the 70s. And I would have thought writing a BIOS in it would be both very efficient, and (relatively) easy. It lends itself easily to writing assembler primitives, and then using a higher level construct to link them together.

Forth had the reputation of creating executables that beat assembler for compactness ...

Cheers,
Wol

Open Firmware uses Forth

Posted Aug 2, 2020 13:27 UTC (Sun) by guv (guest, #140573) [Link]

If your programs end up compiled into lists of (back when, 16-bit) pointers to other such lists, eventually ending up pointing to primitives, that beats the assembly alternative of lists of call instructions. If the Forth system generates (inline) assembly instructions instead of issuing (implicit) calls, code size will be more comparable to what you usually see from compiled languages.
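The "lists of pointers to other such lists" model can be illustrated with a toy sketch. This is not how any real Forth is implemented: Python lists stand in for compiled word definitions, Python functions stand in for machine-code primitives, and the names `run`, `dup`, `mul`, `square`, and `fourth_power` are made up for the example.

```python
def run(word, stack):
    """Execute a word: primitives run directly, definitions are walked."""
    if callable(word):
        word(stack)              # a primitive: the end of the pointer chain
    else:
        for item in word:        # a definition: a list of references
            run(item, stack)


def dup(stack):
    """Primitive: duplicate the top of the stack."""
    stack.append(stack[-1])


def mul(stack):
    """Primitive: multiply the top two stack items."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)


# Definitions hold only references to other words, no call instructions
square = [dup, mul]
fourth_power = [square, square]

stack = [3]
run(fourth_power, stack)
print(stack)  # prints [81]
```

The size argument then comes down to the cost per reference. Assuming, for illustration, a Z80-class CPU where a call instruction occupies three bytes, each entry in a definition costs two bytes as a 16-bit pointer versus three bytes as a call, at the price of running an inner interpreter loop.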

It is telling that OFW originated at Sun when it was still an engineering company and got used by the engineering folks at Apple. It is also telling that Forth used to be used as a boot stage in FreeBSD, where it has now been replaced by Lua. Being effective with Forth takes a certain mindset that not many have.

You could try and drop coreboot on some suitable hardware and see if there isn't an OFW-payload available.

Starting Forth is available free in electronic form on forth.com.

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 0:16 UTC (Wed) by atai (subscriber, #10977) [Link]

Google, managing the world's information

old information too?

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 2:28 UTC (Wed) by connert (guest, #140463) [Link]

I've moved the original support request to the correct community; hopefully I can also get some more information on this.

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 5:17 UTC (Wed) by bokr (subscriber, #58369) [Link]

What if UNESCO declared usenet to be a World Heritage legacy?

IMO Libraries should not be burned helter skelter, even if mistakenly sold into
unrestricted private ownership.

But why would Google burn usenet rather than offer it for sale,
if they want to get rid of it?

Why wouldn't they recognize the great goodwill value for themselves
of just transferring the archives to the FSF (Free Software Foundation)?

From Wikipedia:
_______________________________________________________________________________________________________________

UNESCO has 193 member states and 11 associate members.[5] Based in Paris, France, most of its field offices are "cluster" offices that cover three or more countries; national and regional offices also exist.

UNESCO seeks to build a culture of peace and inclusive knowledge societies through information and communication.[6] To that end, it pursues its objectives through five major program areas: education, natural sciences, social/human sciences, culture and communication/information. It sponsors projects related to literacy, technical training, education, the advancement of science, promoting independent media and freedom of the press, preserving regional and cultural history, and promoting cultural diversity. UNESCO assists in translating and disseminating world literature, establishing international cooperation agreements to secure "World Heritage Sites" of cultural and natural importance, preserving human rights, and bridging the worldwide digital divide. It also launched and leads the Education For All movement and lifelong learning.

UNESCO world heritage?

Posted Jul 30, 2020 8:45 UTC (Thu) by pixelpapst (guest, #55301) [Link]

Getting a thing in the running for being recognized as UNESCO world heritage is, as one would expect, a bureaucratic process spanning multiple nations. But it can actually be initiated by mere mortals.

I've been loosely following the effort to get the Demoscene recognized as such. Here's a nice introductory talk: https://www.youtube.com/watch?v=k3Su7DA8moE

In case you (or anybody else) decide to tackle this for Usenet, please drop me a line.

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 12:00 UTC (Wed) by gray_-_wolf (subscriber, #131074) [Link]

The link gives `Sorry, this page can't be found.`, so it was deleted? Or is it not supposed to be readable by the public?

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 15:29 UTC (Wed) by sumanah (guest, #59891) [Link]

I'm having the same issue.

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 15:39 UTC (Wed) by corbet (editor, #1) [Link]

Weird...it works for me still...

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 15:47 UTC (Wed) by jake (editor, #205) [Link]

> I'm having the same issue.

Me three, I thought it might be related to this comment: https://lwn.net/Articles/827293/ but dunno ...

jake

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 9:29 UTC (Thu) by knewt (subscriber, #32124) [Link]

I believe so, yes. It did redirect once for me, but not again. Luckily I looked at the new url. It's now available at https://support.google.com/groups/thread/61391913

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 13:32 UTC (Wed) by dtnameh (guest, #140476) [Link]

Would it be possible to move them to archive.org?

Historical programming-language groups disappearing from Google

Posted Jul 29, 2020 15:58 UTC (Wed) by oldtomas (guest, #72579) [Link]

The decline of the Library of Alexandria [1].

[1] https://en.wikipedia.org/wiki/Library_of_Alexandria

Historical programming-language groups disappearing from Google

Posted Oct 13, 2020 15:33 UTC (Tue) by immibis (subscriber, #105511) [Link]

Yes, but instead of a fire, it was bulldozed to build a shopping mall to compete with the one across the street.

Historical programming-language groups disappearing from Google

Posted Jul 30, 2020 7:45 UTC (Thu) by bokr (subscriber, #58369) [Link]

Has anyone (EFF?) considered the aspect of destroying evidence
of prior art in the public domain?

There were a lot of brilliant people discussing ideas pretty freely.

How many patents exist making exclusive claims on ideas
discussed openly in the archives, I wonder.

And how many such patents are pending now?


Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds