[PREV - PATTERN_ACTION] [TOP]
BACK_SCRATCH
May 16, 2023
From Steven Levy's "In the Plex" (2011), p. 17:
"... But it wasn't at all obvious what
linked *to* a page. To find that out,
you'd have to somehow collect a database
of links that connected to some other
page. Then you'd go *backward*."
"That's why Page called his system BackRub.
'The early version of hypertext had a tragic
flaw: you couldn't follow links in the other
direction,' Page once told a reporter.
'BackRub was about reversing that.'"
Nelson's Xanadu project has been written out
of this history of "early versions" of
hypertext.
p.16:
"Having a human being determine the ratings
was out of the question. First, it was
inherently impractical. Further, humans
were unreliable. Only algorithms-- well
drawn, efficiently executed, and based on
sound data-- could deliver unbiased results.
So the problem became finding the right data
to determine whose comments were more
trustworthy, or interesting than others."
But it's humans all the way down-- as
presented here, this is a quest for
something like an objective view of the web
of information, but it can't be anything
but an "unbiased" rendering of human biases.
Missing from this story is
yahoo, which was originally
in the business of
presenting a human-curated
collection of links
"yet another hierarchically
organized o--"
Ontology?
This is indeed difficult--
certainly yahoo had difficulty
maintaining quality over time--
but not exactly undoable, at
least not in the early days of
the web.
I have a theory that as web
consolidation has progressed and the
intelligence level of the average
contribution continues to be
diluted, we're back in a regime
where it might be a winning
strategy to do collections of
human-curated links.
To get *serious* material on a
subject, you can probably write
down a list of a few dozen sites to
start. The collections of material
in some of those places may well be
massive, but it's bound to be more
tractable than spidering the entire
web, and more to the point, the
material is going to be *already
indexed* in many cases. Collating
search results from the search
features at individual sites would
get you a lot of the way there.
Consider:
Google Scholar
Blekko
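One rough sketch of the collating idea above,
under the assumption that each curated site
exposes some sort of search endpoint returning
JSON-- the site list, URL patterns, and
response shape here are all hypothetical, not
real APIs:

    import requests

    # Hypothetical curated sites and invented search URL patterns.
    CURATED_SITES = {
        "example-journal.org": "https://example-journal.org/search?q={query}",
        "example-archive.net": "https://example-archive.net/find?terms={query}",
    }

    def collate_search(query):
        """Ask each curated site's own search feature, merge the hits."""
        results = []
        for site, url_template in CURATED_SITES.items():
            try:
                resp = requests.get(url_template.format(query=query),
                                    timeout=10)
                resp.raise_for_status()
            except requests.RequestException:
                continue   # skip sites that are down
            # Pretend each site returns a JSON list of
            # {"title": ..., "url": ...} records.
            for hit in resp.json():
                results.append({"site": site,
                                "title": hit["title"],
                                "url": hit["url"]})
        return results

The hard part in practice is, of course,
parsing each site's results into a common form
and deciding how to rank the merged list.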
And there's a need
for an end-run
around wikipedia's
"nofollow" policy.
p. 17:
"Page, a child of academia, understood that
web links were like citations in a scholarly
article. It was widely recognized that you
could identify which papers were really
important without reading them-- simply tally
up how many other papers cited them in notes
and bibliographies."
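The tally Levy describes is just the in-degree
of the citation graph; a toy version, with
paper names and reference lists invented:

    from collections import Counter

    # paper -> papers it cites (invented data)
    references = {
        "paper_x": ["paper_z"],
        "paper_y": ["paper_z", "paper_x"],
        "paper_z": [],
    }

    # "Importance" = how often a paper appears in
    # other papers' reference lists.
    tally = Counter(cited
                    for refs in references.values()
                    for cited in refs)
    print(tally.most_common())   # [('paper_z', 2), ('paper_x', 1)]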
You can use this to identify "importance",
but there isn't any way to use this to
find unfairly ignored high-quality work.
And people writing academic papers are
well aware that they have to be
careful to cite predecessors *when
they're already regarded as
important*. These are people who may
well be critical for your own career;
you don't want to offend them.
SUPERCONDUCTING_STATE
Citation indexing is a guide to quality
that relies on the intellectual integrity
of the human beings publishing the
research...
And this problem is even worse for the
web, where identities remain slippery,
and motivations are often even more
corrupt-- political operatives and
commercial shills abound, in addition
to outright crazies, and the pathetic
sabotage efforts of trolls.
ENGINE_TROUBLE
The idea that you can *automatically*
navigate this chaff to find the
true gold looks increasingly foolhardy.
And there's a bad problem that follows
from this one: what *point* is there in
creating a new work that you know will
be effectively invisible?
--------
[NEXT - SNAKE_SCRATCH]