hi samizdat-devel,

SQUISH QUERIES - GREAT IDEA + COULD THEY OVERLOAD?

i just realised something which we don't seem to have "advertised"
much regarding samizdat vs the other candidate indymedia CMSes, and
which i had more or less forgotten about.

If i understand correctly, using e.g. the "All Replies" link in Links
or the Squish queries, any anonymous user can run any query directly
against the postgresql database with the authorization of the apache
user. i didn't think much about it, because it's clearly not a
priority for "non-techies".

However, IIUC, this is definitely an extremely good thing in terms of
scalability, non-hierarchy and distribution of information. It means,
for example, that each local imc could have its own version of the
imc contact database (all the public parts) and then a web of trust
could grow more organically, with the help of robots doing the
administrative side while people do the fun stuff.

Naive question: On the other hand, couldn't this be a problem in
terms of attacks on the system? When we tried to get mir to do
something slightly creative, in order to provide parallel solutions
to people who had different views on publishing priorities, at some
point our "obvious" hack to a "template" (mir terminology) created a
job which took 10-20 minutes to run each time somebody published a
new article. As long as no more than a few articles per hour were
published (and 48 articles a day would already be a lot) this was
annoying but not critically bad. It turned out that the problem was
related to N^2 searching through the database. This was not an
attempt to attack the system (on the contrary!), but it generated a
lot of excess load for the CPU.

People have probably already thought this through, and i have started
reading through query.rb, and maybe the answers lie there, but my
question remains (at least for the moment): are there some mechanisms
in place to make sure that robots (with good intentions) or spambots
(with bad intentions) cannot overload the cpu+database?

i can see one setting that limits this:

# Size Limits
#
limit:
  pattern: 7   # maximum size of search query pattern

i seem to remember that in postgresql there are some functions which
evaluate the "cost" of a query. At the risk of adding extra work
(a time delay) to all ordinary queries, it should presumably be
possible to first check the "cost" of the query, and if the "cost"
would be too high (i.e. the query would take too long), then refuse
to run it and take some sensible "rescue" action - e.g. inform the
user and suggest that s/he design a more efficient or less demanding
query.
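
To make the idea a bit more concrete, here is a rough sketch in Ruby.
It assumes that the "cost" function i half remember is postgresql's
EXPLAIN command, which plans a query without running it and reports an
estimated cost in arbitrary planner units (not seconds). All the names
below (MAX_ESTIMATED_COST, QueryTooExpensive, estimated_total_cost,
check_query_cost) are made up for illustration and are not taken from
query.rb, and the limit value is a guess that would need tuning.

```ruby
# Hypothetical sketch: none of these names come from samizdat's query.rb.

# Upper bound on the planner's estimated total cost. The units are
# arbitrary planner units, not seconds; the value is an assumption.
MAX_ESTIMATED_COST = 10_000.0

class QueryTooExpensive < StandardError; end

# Extract the estimated total cost from the first line of EXPLAIN output,
# e.g. "Seq Scan on message  (cost=0.00..155.00 rows=10000 width=4)".
# The second number in "cost=startup..total" is the total estimate.
def estimated_total_cost(explain_first_line)
  match = explain_first_line.match(/cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?)/)
  raise ArgumentError, "no cost estimate in: #{explain_first_line}" unless match
  Float(match[2])
end

# Return the estimated cost, or raise QueryTooExpensive if it is over the
# limit. `explain` is any callable that takes SQL and returns the first
# line of "EXPLAIN <sql>". With the pg gem it might look like
#   ->(sql) { conn.exec("EXPLAIN " + sql).getvalue(0, 0) }
def check_query_cost(sql, explain, max_cost = MAX_ESTIMATED_COST)
  cost = estimated_total_cost(explain.call(sql))
  if cost > max_cost
    raise QueryTooExpensive,
          "estimated cost #{cost} exceeds limit #{max_cost}; " \
          "please try a more selective query"
  end
  cost
end
```

The caller would rescue QueryTooExpensive and show the message to the
user instead of running the query. Since the planner's estimate can be
badly wrong, a hard cap such as postgresql's statement_timeout setting
is probably still needed as a backstop.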
Any thoughts?
cheers
boud
_______________________________________________
samizdat-devel mailing list
samizdat-devel nongnu.org
http://lists.nongnu.org/mailman/listinfo/samizdat-devel