Suppose I want to get index.html from finance.yahoo.com. Following
the guidance from these sites:
http://www.jmarshall.com/easy/http/
http://web-sniffer.net/
I learned that the following should work:
Printf.fprintf
  outchan
  "GET / %s"
  "HTTP/1.1\r\n\
   Host: finance.yahoo.com\r\n\
   Connection: close\r\n\
   Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\n\
   Accept-Language: en-us,en;q=0.5\r\n\
   Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n\
   User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0 Web-Sniffer/1.0.24\r\n\
   \r\n";
flush outchan;
However, this always gives me an error 400 (Bad Request), i.e. Yahoo
rejects the request as malformed. I've also tried reducing the GET
request to "GET / HTTP/1.1\r\n\r\n". Any hints?
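
For reference, here is a minimal self-contained sketch of how such a request can be assembled and sent, so the exact bytes on the wire are easy to inspect. The names build_request and fetch_status_line are hypothetical helpers, not part of any library, and the sketch assumes plain HTTP on port 80 and a program linked against the standard unix library. Note that HTTP/1.1 requires a Host header, so a bare "GET / HTTP/1.1\r\n\r\n" is expected to be answered with a 400.

```ocaml
(* Build a minimal HTTP/1.1 GET request. Every line ends in CRLF and an
   empty line terminates the header block. [build_request] is a
   hypothetical helper introduced for this sketch. *)
let build_request ~host ~path =
  String.concat "\r\n"
    [ Printf.sprintf "GET %s HTTP/1.1" path;
      Printf.sprintf "Host: %s" host;   (* mandatory in HTTP/1.1 *)
      "Connection: close";
      "";                               (* these two empty strings yield *)
      "" ]                              (* the final "\r\n\r\n"          *)

(* Open a TCP connection to host:80 (an assumption: plain HTTP), send the
   request, and return the first line of the response. [input_line]
   strips the trailing '\n' but leaves the '\r' in place. *)
let fetch_status_line host path =
  let addr = (Unix.gethostbyname host).Unix.h_addr_list.(0) in
  let inchan, outchan = Unix.open_connection (Unix.ADDR_INET (addr, 80)) in
  output_string outchan (build_request ~host ~path);
  flush outchan;
  let status = input_line inchan in
  Unix.shutdown_connection inchan;
  close_in inchan;
  status
```

Calling fetch_status_line "finance.yahoo.com" "/" would then return the server's status line; printing build_request ~host ~path with String.escaped beforehand is a quick way to check that the CR and LF characters really are in the string being sent.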