My site statistics are now available as a web service. Example:
>>> import xmlrpclib
>>> url = 'http://diveintomark.org/cgi-bin/webservices.cgi'
>>> server = xmlrpclib.Server(url)
>>> server.siteStatistics.getHitCount()
2173
>>> server.siteStatistics.getAllVisitorCountByOS()
{'WinXP': 48, 'Mac OS': 121, 'Mac OS X': 10, 'Win98': 120, 'Linux': 35, 'Win95': 24, 'Win2000': 143, 'WinNT': 247}
You can see which services are available with the system.listMethods method:
>>> import pprint
>>> pprint.pprint(server.system.listMethods())
{'examples.getStateName(stateIndex)': 'None',
'siteStatistics.getAllReferers()': 'get list of (count, domain, url) for each referer',
'siteStatistics.getAllSearches()': 'get (count, url, searchstring) for all Google searches',
'siteStatistics.getAllVisitorCountByBrowser()': 'get dictionary of {browser : visitor count}',
'siteStatistics.getAllVisitorCountByOS()': 'get dictionary of {operating system : visitor count}',
'siteStatistics.getHitCount()': 'get total number of page hits',
'siteStatistics.getRefererCount()': 'get total number of unique referers (based on domain name)',
'siteStatistics.getReferersByDomain(domainname)': 'get (count, url) for a single referer',
'siteStatistics.getVisitorCount()': 'get total number of unique visitors (based on IP address)',
'siteStatistics.getVisitorCountByBrowser(browsername)': "get number of visitors using a specific browser",
'siteStatistics.getVisitorCountByOS(osname)': "get number of visitors using a specific operating system",
'system.listMethods()': 'None'}
As you may have noticed from the URL passed to xmlrpclib.Server, this is all done through CGI. This means it can work on any platform that can run Python and CGI scripts. (xmlrpclib ships with Python 2.2 and later; if you’re on an older version, you’ll need to download the XML-RPC library and throw it in your cgi-bin directory.)
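To make the CGI angle concrete: an XML-RPC call is just an XML document sent in an HTTP POST, which is why an ordinary CGI script can serve it. Here is a minimal sketch of what the client puts on the wire. It assumes Python 3, where xmlrpclib lives on as xmlrpc.client; the method name comes from the session above, and nothing here touches the network.

```python
import xmlrpc.client

# Marshal a call to siteStatistics.getHitCount() with no arguments.
# This XML is what would travel in the body of the HTTP POST.
request_xml = xmlrpc.client.dumps((), methodname="siteStatistics.getHitCount")
print(request_xml)

# A CGI script on the receiving end could parse the body back into
# (params, method name) before deciding which handler to call.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```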
The CGI script (webservices.cgi, open source, Python license) just acts as a dispatcher, taking the XML-RPC request and passing it off to the appropriate handler function. Handlers are defined in separate .py scripts in a specific directory (WEBSERVICESDIR, hardcoded at the top of the CGI script). This means that new web services can be deployed by simply copying new .py scripts into this directory. Examples:
- siteStatistics.py - implements functions for getting site statistics. The data comes from a file generated by the same script (sitestats.py, open source, GPL license) that generates the HTML version. People like pretty pictures; computers like data. (Note: if you want to implement sitestats.py on your own server, you’ll need template.html and referers.css as well.)
- examples.py - implements the canonical XML-RPC example, getStateName.
- system.py - implements the listMethods function, which dynamically determines which web services are available by introspecting into all the .py scripts in the WEBSERVICESDIR directory. No need to register new services manually; just copy the .py file into the directory, and both the CGI wrapper and the listMethods function will find it immediately.
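The dispatch-and-discover design described above can be sketched roughly as follows. This is not the code from webservices.cgi or system.py; it is a minimal Python 3 illustration under assumptions. The helper names (load_service, dispatch, list_methods) and the services_dir parameter (standing in for WEBSERVICESDIR) are hypothetical, and a real dispatcher would need more error handling.

```python
import importlib.util
import inspect
import os
import tempfile


def load_service(services_dir, service_name):
    """Import <services_dir>/<service_name>.py as a module (hypothetical helper)."""
    if not service_name.isidentifier():
        # Reject names like "../secret" before building a file path.
        raise ValueError("bad service name: %r" % service_name)
    path = os.path.join(services_dir, service_name + ".py")
    spec = importlib.util.spec_from_file_location(service_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def dispatch(services_dir, method_name, params):
    """Route 'service.function' to function() defined in service.py."""
    service_name, function_name = method_name.split(".", 1)
    module = load_service(services_dir, service_name)
    return getattr(module, function_name)(*params)


def list_methods(services_dir):
    """Map 'service.function(args)' to the function's docstring."""
    methods = {}
    for filename in sorted(os.listdir(services_dir)):
        if not filename.endswith(".py"):
            continue
        service_name = filename[:-3]
        module = load_service(services_dir, service_name)
        for name, func in inspect.getmembers(module, inspect.isfunction):
            # Skip private helpers and functions imported from elsewhere.
            if name.startswith("_") or func.__module__ != module.__name__:
                continue
            key = "%s.%s%s" % (service_name, name, inspect.signature(func))
            methods[key] = str(func.__doc__)
    return methods


def demo():
    # Deploying a service is just dropping a .py file into the directory.
    with tempfile.TemporaryDirectory() as services_dir:
        with open(os.path.join(services_dir, "examples.py"), "w") as f:
            f.write("def getStateName(stateIndex):\n"
                    "    return ['Alabama', 'Alaska'][stateIndex - 1]\n")
        print(dispatch(services_dir, "examples.getStateName", (2,)))
        print(list_methods(services_dir))


if __name__ == "__main__":
    demo()
```

Because both functions scan or load from the directory on every call, nothing has to be registered, which is also where the per-request discovery cost comes from.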
This design was inspired by Dave Winer’s observation that Microsoft’s framework for web services required too much overhead for developers. Look at examples.py again: there is absolutely no overhead involved in making getStateName into a web service. The CGI wrapper handles all the marshalling and unmarshalling of data over the wire, and the service-specific .py script just handles business logic.
>>> server.examples.getStateName(41)
'South Dakota'
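To illustrate that split between wrapper and business logic, here is a rough Python 3 sketch. It is a guess at the shape, not the real examples.py or webservices.cgi: the STATES list is truncated, and the handle function and its handlers dictionary are hypothetical names made up for this example.

```python
import xmlrpc.client

# Truncated for illustration; the real list would hold every state.
STATES = ["Alabama", "Alaska", "Arizona"]


def getStateName(stateIndex):
    # Business logic only: no XML, no HTTP, nothing web-service-specific.
    return STATES[stateIndex - 1]


def handle(request_xml, handlers):
    """What a wrapper might do: unmarshal, call the handler, marshal."""
    params, method_name = xmlrpc.client.loads(request_xml)
    try:
        result = handlers[method_name](*params)
        return xmlrpc.client.dumps((result,), methodresponse=True)
    except Exception as exc:
        # Report any failure to the caller as an XML-RPC fault.
        fault = xmlrpc.client.Fault(1, "%s: %s" % (type(exc).__name__, exc))
        return xmlrpc.client.dumps(fault, methodresponse=True)


if __name__ == "__main__":
    request = xmlrpc.client.dumps((2,), methodname="examples.getStateName")
    response = handle(request, {"examples.getStateName": getStateName})
    print(xmlrpc.client.loads(response)[0][0])
```

The handler never sees XML; everything wire-related lives in the wrapper, so the same function works unchanged if you call it directly from other Python code.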
So is this a competitor to .NET-based web services? No. It’s CGI, so there are scalability considerations. The dispatcher does automatic discovery of available web services every time you call it, so there are performance considerations. There’s no authentication, no encryption, no logging. On the plus side, it’s lightweight, open source, cross-platform, and doesn’t require any special server-side privileges beyond the ability to install your own CGI scripts.

