Re: googling google persistence using google api
<hawat.thufir@gmail.com> wrote in message
news:1149550133.342150.221200@f6g2000cwb.googlegroups.com...
I've been over to <http://www.google.com/apis/> and downloaded
GoogleAPIDemo.java, which is much simpler than expected. I'd like to
save the query results to a file. I'm thinking that each query result
could almost be appended to an XML file; this seems the most natural and
easiest approach.
So far so good...
However, I'm, err, not finding anything on google about doing so. I
know that the demo uses SOAP and I know that I know nothing about SOAP
and have no real interest in learning about SOAP at this time.
However, I believe that SOAP is good for communicating with databases,
yes?
I don't think so. I'm not an expert on SOAP, but my understanding is that
it's just passing XML documents around as messages, primarily using the HTTP
protocol. So normally an HTTP request looks *vaguely* something like this:
HTTP 1.1
I'm the Firefox browser.
GET /index.html
And so to use SOAP, you'd form a request that looks vaguely something like
this:
HTTP 1.1
I'm the Firefox browser.
POST (the length of the following message)
<execute methodName="foo">
<param name="bar" type="int" value="42"/>
</execute>
Or something like that, and the web server responds with another XML
document, probably something along the lines of
<result methodName="foo">
<returnValue type="float" value="3.1415"/>
</result>
In other words, it's your basic remote procedure calls using XML as the
serialization mechanism.
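For what it's worth, a real SOAP message wraps that kind of call in an Envelope and a Body element. Here is a minimal sketch in Java that only builds the message as a string and sends nothing over the network. The class name and the "urn:example" namespace are made up for illustration, foo and bar are the made-up method and parameter from above, and the exact element layout a given service expects will differ:

```java
public class SoapEnvelopeSketch {

    // SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /**
     * Builds a SOAP-style request for calling methodName with one parameter.
     * The "urn:example" namespace is a hypothetical placeholder, and the
     * arguments are assumed to contain no characters that need XML escaping.
     */
    public static String buildRequest(String methodName, String paramName, String value) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">\n"
             + "  <soap:Body>\n"
             + "    <m:" + methodName + " xmlns:m=\"urn:example\">\n"
             + "      <" + paramName + ">" + value + "</" + paramName + ">\n"
             + "    </m:" + methodName + ">\n"
             + "  </soap:Body>\n"
             + "</soap:Envelope>\n";
    }

    public static void main(String[] args) {
        // Prints the XML that a client would send as the body of the HTTP request.
        System.out.print(buildRequest("foo", "bar", "42"));
    }
}
```

The point is only that the "procedure call" is an ordinary XML document; a SOAP toolkit generates and parses these for you.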
I've been over to <http://www.w3.org/TR/soap/> which led me to:
Abstract
SOAP is a lightweight protocol for exchange of information in a
decentralized, distributed environment. It is an XML based protocol
that consists of three parts: an envelope that defines a framework for
describing what is in a message and how to process it, a set of
encoding rules for expressing instances of application-defined
datatypes, and a convention for representing remote procedure calls and
responses. SOAP can potentially be used in combination with a variety
of other protocols; however, the only bindings defined in this document
describe how to use SOAP in combination with HTTP and HTTP Extension
Framework.
Right, hopefully with the examples above, this definition is a bit more
clear.
which all sounds very nice but it seems like I'd have to buy a book to
get it working. There's lightweight, then there's lightweight, if you
get my meaning, and I think that SOAP is a bit not-lightweight for my
purposes. I would like an XML file with something like (please forgive
my atrocious XML):
<result>
<www.whatever.com />
<some text here />
</result>
<result>
<www.foo.com />
<different text here />
</result>
<result>
<www.bar.com />
<text on bar here />
</result>
That is rather atrocious. How about something like:
<ResultSet>
<Result url="http://www.whatever.com">some text here</Result>
<Result url="http://www.foo.com">different text here</Result>
<Result url="http://www.bar.com">text on bar here</Result>
</ResultSet>
Or something along those lines, which mirrors whatever google gives the
demo. However, I can't find something ready-made for creating such a
file, which surprises me. I'm looking for something relatively easy
which I can put together with the demo to save some results to file,
such that the results can be put into a database at a later time.
However, my google results are less than stellar. How do people
save/persist their google results?
To emit XML documents to disk, you could build up the document tree in
memory using JDOM or some similar API, and then serialize it. However, I
find it much simpler just to use a bunch of println statements on a
PrintWriter wrapped around a FileWriter. E.g., with out being the PrintWriter:
out.println("<Result url=\"" + url + "\">" + textContent + "</Result>");
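Fleshed out a little, the println approach might look like the sketch below. The class name, method names and the results.xml file name are all made up for illustration, and the results are assumed to arrive as simple {url, text} pairs rather than whatever objects the Google demo actually returns. The one thing worth adding to the one-liner is escaping, since URLs and snippets routinely contain & and < characters that would otherwise break the XML:

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ResultSetWriter {

    /** Escapes the characters that are special in XML text and attribute values. */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Writes a ResultSet document; each entry of results is a {url, text} pair. */
    public static void write(Writer target, String[][] results) {
        PrintWriter out = new PrintWriter(target);
        out.println("<ResultSet>");
        for (String[] r : results) {
            out.println("  <Result url=\"" + escape(r[0]) + "\">" + escape(r[1]) + "</Result>");
        }
        out.println("</ResultSet>");
        out.flush(); // leave closing the underlying writer to the caller
    }

    public static void main(String[] args) throws IOException {
        String[][] results = {
            {"http://www.whatever.com", "some text here"},
            {"http://www.foo.com", "different text here"},
        };
        // Write as UTF-8 explicitly instead of relying on the platform default.
        try (Writer file = Files.newBufferedWriter(Paths.get("results.xml"), StandardCharsets.UTF_8)) {
            write(file, results);
        }
    }
}
```

Note that this writes the whole document in one go. Appending later results to the same file is not as simple as adding lines at the end, because the new Result elements have to go before the closing ResultSet tag.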
My personal computer is broken at the moment, so this is all hypothetical.
I haven't compiled the demo yet, never mind modifying it.
Note that you might need to get a license key from Google to use its
querying API, and there's a limit of 1000 queries per day.
- Oliver