dive into mark


Tuesday, September 2, 2003

Microsoft web services, brought to you by the letter L and the number 0

Microsoft Web Services 1.0. The Microsoft.com Web Service is an XML Web service that will enable you to integrate information and services from MSDN, Technet, other Microsoft.com sites, and Microsoft Support. As a way to test the new architecture, we present version 1.0 of the Microsoft.com Web Service, which allows you to integrate information about the Microsoft.com top downloads. Future releases will build on this architecture to provide access to a broader variety of Microsoft content and services.

According to the page, you need to sign up for a developer token (similar to a Google API key). The developer’s token and password are used to authenticate you to the Web service using a WS-Security SOAP header. No further information is available, because the documentation for the web service is wrapped in a Windows installer which will only install over Visual Studio .NET.

I’m going to repeat that, in case you missed it: the documentation for the web service is wrapped in a Windows installer which will only install over Visual Studio .NET.

Luckily for me, my next project for my day job involves VS.NET, so I happen to have installed it recently. Unluckily, the application form to get a developer token is totally broken. First, it’s an SSL site with an apparently invalid security certificate. Luckily for me, I have Internet Explorer, which helpfully ignores this sort of problem. Second, it requires me to read the random series of numbers and letters out of an image (a technique which is completely inaccessible, and for which no audio-based fallback technique is provided). Luckily for me, I have a working pair of eyes. Unluckily, my image included the lowercase letter “L” and the number “0”. Needless to say, there are 4 combinations (“L”, “1”, “o”, “0”). Luckily for me, I got it right on the fourth try. Unluckily, once I got it right, I got a new error message saying “Internal Service Error, please try again after sometime.”

Presumably these first-day jitters will eventually be ironed out and I will be able to write the inevitable Python wrapper for the exciting new Microsoft Web Services so that you can, from any platform and without paying $1000 for Microsoft developer tools, programmatically determine the top Microsoft downloads of the day. I can hardly wait.
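For the curious, the “4 combinations” arithmetic above can be sketched in a few lines of Python. This is only an illustration: the sample string "x7l0" and the helper name candidate_readings are made up, and the table of look-alike characters covers just the two pairs mentioned in the post.

```python
from itertools import product

# Characters that are easy to confuse in a distorted image.
# Only the two pairs from the post are listed (an assumption, not a full table).
AMBIGUOUS = {"l": "l1", "1": "l1", "o": "o0", "0": "o0"}


def candidate_readings(text):
    """Return every string a reader might plausibly type for `text`."""
    # Each character maps to its set of look-alikes, or to itself if unambiguous.
    choices = [AMBIGUOUS.get(ch, ch) for ch in text]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical image text with one "l" and one "0".
readings = candidate_readings("x7l0")
print(len(readings), readings)
```

Two ambiguous characters with two readings each give 2 × 2 = 4 candidates, so getting it right on the fourth try is the worst case.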


30 comments

  1. There’s an interesting side point in there, the other day I was using a site that required the “type the text you see in this goofy image” authentication method, and I was wondering about how sight impaired users could deal with that. You can’t use alt text on the image, because that would give the game away. Is there a way to make this technique accessible?

    Comment by Gordon Weakliem — Tuesday, September 2, 2003 @ 2:19 pm

  2. Considering MSDN has one of the worst navigation schemes I’ve ever tried to dig through (not to mention a terrible search engine as well), the sooner the aforementioned Python web service wrapper arrives the better. It does sound like they’ve excelled themselves with the “embrace and extend” philosophy on this one though.

    Comment by Simon Willison — Tuesday, September 2, 2003 @ 2:27 pm

  3. I’m suspecting this is a combination of only supporting their own products and only supporting their recent products. I’m fairly convinced Visual Studio 6.0 isn’t being actively supported anymore, and pushing out sample code for this would be them violating their own policy. Similarly, you don’t see PHP source code along with ASP+VBScript source code in MSDN.

    Comment by Jesper — Tuesday, September 2, 2003 @ 2:35 pm

  4. At ~$1000 for a license, it’s more like “embling and extend”.

    </got_nothing>

    Comment by Ethan — Tuesday, September 2, 2003 @ 2:35 pm

  5. re: “is there a way to make this technique accessible”? No, but you can provide a more accessible fallback technique, such as “listen to this recording and type what you hear”, which only discriminates against the deaf-blind. And then provide an alternative fallback technique, such as “call us up and we’ll give you your developer token”. Naturally, companies are hesitant to do that, since it requires additional non-web infrastructure and opens the door to massive additional expense and abuse. Absolute accessibility is expensive, just like absolute security. You have to decide how much you care, at what level.

    In this case, I’m not sure why they’re doing it at all. Spambots want to sign up for multiple developer tokens? Um, why? You can use a single token 10 times a second. Developers want to write and deploy high-volume web services applications that rotate through auto-generated developer keys to integrate Microsoft web services and serve hundreds of customers per second? The potential problems here seem vanishingly small and certainly handle-able later, rather than forcing everyone to go through all this up-front rigamarole.

    Comment by Mark — Tuesday, September 2, 2003 @ 2:36 pm

  6. “Can this be made accessible?”

    I expect that an OCR program, hooked up to a screen-reader might work. :-)

    This is an interesting variant on the Turing Test.
    Can one distinguish a human respondent from a machine? Pretty easily, if a human is administering the test. But what if you want a *machine* to administer the test? Can a *machine* distinguish a human respondent from a machine?

    I suspect that, in general, the answer is “no.” Which makes this exercise stupid, in addition to being an impediment to accessibility.

    Comment by Jacques Distler — Tuesday, September 2, 2003 @ 2:39 pm

  7. Mark,
    I seem to remember being in an email thread with some of the folks involved with implementing this. I’ll try to dig it up and forward your post to the appropriate parties.

    Comment by Dare Obasanjo — Tuesday, September 2, 2003 @ 2:44 pm

  8. Another interesting point: the restrictions on redistribution of sample code. Specifically, if you want to redistribute original or modified included sample code, it can only be with a package that only runs on Windows. As a shareholder and Microsoft employee, this makes sense to me; as a cross-platform hobbyist developer, I’m less convinced.

    Comment by Tim Jarrett — Tuesday, September 2, 2003 @ 4:37 pm

  9. You were being sarcastic in this article, right?

    *big honking grin and sly eyes slowly appear on face while head slightly turns to cough falsely*

    I’ll be returning to read about what cool things you may be able to do with this service, ’cause it’s much better learning things second hand from really smart people who get frustrated by “crap that sort-of smart people just give up on in 5 minutes”. You’ve saved us many headaches by sharing yours and handing out your version of sarcastic aspirin. I love it. Thanks Mark.

    Comment by Jai — Tuesday, September 2, 2003 @ 5:02 pm

  10. What, you want it to work through something other than an MS product? And distribute it for free?

    You’re one of those commies, right?

    Comment by kami — Tuesday, September 2, 2003 @ 5:07 pm

  11. re: “is there a way to make this technique accessible”? (1)

    and re: “This is an interesting variant on the Turing Test.” (6)

    Why not ask the “user” (human, spambot, or otherwise) a question that AI can’t solve as of yet?

    Remember those cognition tests in grade school? The ones that presented four images and asked which doesn’t belong? You can use four images with four alt tags, and a computer shouldn’t be able to figure that one out.

    Or, a “finish this sentence” type of question…Present the user with four words to finish a very common-sense type sentence. Create a few dozen of these and randomize them, and I’d imagine that a spambot wouldn’t be able to break through.

    It seems to me that there are perfectly accessible alternatives to this letters-inside-of-graphics thing, so long as you spend a minute or two thinking about it.

    Comment by Ken Walker — Tuesday, September 2, 2003 @ 6:07 pm

  12. Without wanting to get into a big flamewar, the “read this graphic and type the letters” accessibility issue is minuscule compared to the “read this English graphic and type the English letters” i18n issue.

    By making these characters English-only, you’re requiring that a user has a keyboard that contains western characters. Keycaps or no, that is a complete PITA for non-western users.

    Rather than try to authenticate by viewing a graphic or listening to a song, or typing the answer to an English question (how nonsensical is that?) the site should authenticate based upon an end-user certificate. Eg, if you don’t have a personal certificate, you don’t get the goods.

    Comment by Mark2 — Tuesday, September 2, 2003 @ 6:31 pm

  13. End-user certificate is a non-starter. never.

    I think it’s time to un-hook computers from the .net

    Comment by wp — Tuesday, September 2, 2003 @ 7:12 pm

  14. Tim - I take it that it’s OK to write sample code for other platforms, as long as it’s not derived from the code included in the SDK?

    Comment by Phillip Pearson — Tuesday, September 2, 2003 @ 7:27 pm

  15. top ten downloads from microsoft.com … hmm … patches, security fixes, hotfixes
    you may as well write a function that randomly selects between any ubiquitous part of windows, and add “buffer overflow vulnerability fix” to it…
    on a serious note: Mark2’s comment took the words pretty much out of my mouth.

    Comment by Patrick H. Lauke — Tuesday, September 2, 2003 @ 7:44 pm

  16. Phillip, good question. The EULA has two relevant sections. Section 2 grants the right to use and modify the Samples source code “for the sole purposes of designing, developing, and testing your software product(s), and to reproduce and distribute the Sample Code, along with any modifications thereof, in object and/or source code form.” Section 3 then places restrictions on redistribution if you choose to exercise your Section 2 rights, for instance section 3.1(ii) which specifies “that the Sample Code [being redistributed] only operate in conjunction with Microsoft Windows platforms.”

    By which I take it that using sample code and modifying it for another platform is fine as long as you don’t redistribute the modified version that runs on another platform. What is unclear is if you write clean sample code that is not derivative of the provided samples, but that runs on another platform. I will be investigating this before I release any software based on this release, and will discuss anything I learn on my blog.

    Obligatory disclaimer provided by our lawyers: This posting is provided “AS IS” with no warranties, and confers no rights

    Comment by Tim Jarrett — Tuesday, September 2, 2003 @ 8:11 pm

  17. I can’t think of any questions that an AI couldn’t solve that another computer could ask. There would always have to be a finite number of types of questions that would be asked. Then it would just be some random variation of parameters.

    To use the “find the one which doesn’t match” problem as an example, you could change the group and the individual items. Some items could even fit in more than one group. For example, glass is clear, hard and fragile. There would still only be a limited number of groups, and a limited number of items to fit in them. If you had too many subtle groupings, even a human would start to have trouble guessing correctly.

    A spammer could write a tool which would get help for the first 100 or so problems and then learn enough to function on its own from then on. Even if it only got it right half the time, who cares? It’s a machine and can try repeatedly for as long as you like.

    On the other hand, I don’t think that most people who write spambots are into artificial intelligence. It might work after all.

    Comment by Theran — Tuesday, September 2, 2003 @ 8:17 pm

  18. The “read this graphic and type in the letters” is much more usable if done with words than with random characters. Native speakers of whichever language is used generally need only look at the graphic once and can type from memory. Non-native speakers have no worse of a time than they would with random characters. Non-western equipment might require something else.

    The accessibility to blind users is kinda ridiculous. Couldn’t blind users just ask someone for 15 seconds of assistance?

    These images generally include “noise” so as to be immune from OCR.

    But the whole thing seems a bit overdone in most cases considering that an individual could open at least 1000 accounts per day manually. I can see that it makes sense for Ticketmaster where script-kiddies pound away trying to tie up Springsteen tix.

    Comment by pb — Tuesday, September 2, 2003 @ 8:56 pm

  19. Trackback by Blah on steroids

  20. A Python wrapper for this would be awesome, but knowing Microsoft, they might not take too kindly to someone doing something that will allow people to avoid buying their crap.

    Comment by vlad — Tuesday, September 2, 2003 @ 9:42 pm

  21. A software company that builds dev tools requires a license to download sample code! And he includes a legal disclaimer at the end of his comment. Talk about a company run by lawyers…

    Comment by Troy Hakala — Tuesday, September 2, 2003 @ 10:38 pm

  22. I’ve managed to get it downloaded here, and have started on a Python wrapper. However, it looks like it uses WS-Security for the authentication, and I can’t find any Python SOAP libraries that support that, so it looks like this is going to be a little trickier than PyGoogle or PyTechnorati …

    BTW about those images-with-text (“CAPTCHAs”): the one that’s being used for this looks _extremely_ OCRable. Check out the PARC site for some harder ones:

    http://www2.parc.xerox.com/istl/projects/captcha/

    examples here:

    http://www2.parc.xerox.com/istl/projects/captcha/captchas.htm

    Their new one, “BaffleText”, uses a language model to generate plausible-sounding words, so it doesn’t have the problem Mark’s talking about (without being vulnerable to dictionary-based attacks). They still suck for blind people, though.

    Comment by Phillip Pearson — Wednesday, September 3, 2003 @ 12:12 am

  23. Progress: I have the Microsoft site responding to a request from Python, using the ZSI SOAP library and a hand-written WS-Security header. Currently GetVersion() and GetCultures() work.

    Comment by Phillip Pearson — Wednesday, September 3, 2003 @ 6:40 am

  24. Progress: GetTopDownloads() works.

    Comment by Phillip Pearson — Wednesday, September 3, 2003 @ 6:56 am

  25. Phillip: please share!

    Comment by Sam Ruby — Wednesday, September 3, 2003 @ 8:52 am

  26. Even the suggestions I made for password-verification systems aren’t so hot, but there *is* a literature on the topic.

    http://joeclark.org/book/sashay/serialization/Chapter06.html#h2-6620

    Comment by Joe Clark — Wednesday, September 3, 2003 @ 9:11 am

  27. Sam: Sure!

    http://www.myelin.co.nz/microsoft_com_py/

    I’ve tested it on Python 2.2 and 2.3 on Debian woody and Python 2.2 on Windows XP Pro.

    Supports all four methods. Returns reasonably Python-like objects.

    Suggestions / improvements welcome.

    Comment by Phillip Pearson — Wednesday, September 3, 2003 @ 9:32 am

  28. Announcement and comments: http://www.myelin.co.nz/post/2003/9/4/#200309041

    Comment by Phillip Pearson — Wednesday, September 3, 2003 @ 10:12 am

  29. You don’t need VS.Net to read the documentation. You can use a tool like Helpware’s Far (http://www.helpware.net/) that will decompile those .HxS files into the underlying set of HTML, JS, images, etc.

    Comment by Eric Promislow — Wednesday, September 3, 2003 @ 9:20 pm

  30. Actually, you can’t get that far, because the .HxS files are installed by an installer that will only install over VS.NET.

    Phil found this, though: http://ws.microsoft.com/MsComService/MsCom.asmx

    Comment by Mark — Wednesday, September 3, 2003 @ 9:45 pm
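Comments 22 and 23 mention a hand-written WS-Security header sent from Python. What follows is a rough sketch of what such a header can look like, not Phillip’s actual code. It assumes the service accepts a plain-text UsernameToken carrying the developer token and password, and it uses the namespace URIs from the later OASIS WS-Security 1.0 standard; the 2003 service may well have expected an earlier draft namespace or a digest password. The function name build_envelope and the method namespace urn:example:mscom are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# OASIS WS-Security 1.0 URIs; an assumption, the 2003 service may have used a draft.
WSSE_NS = ("http://docs.oasis-open.org/wss/2004/01/"
           "oasis-200401-wss-wssecurity-secext-1.0.xsd")
PASSWORD_TEXT = ("http://docs.oasis-open.org/wss/2004/01/"
                 "oasis-200401-wss-username-token-profile-1.0#PasswordText")


def build_envelope(token, password, method, method_ns):
    """Build a SOAP 1.1 envelope with a WS-Security UsernameToken header."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("wsse", WSSE_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")

    # The credentials travel in the SOAP header, not in the method call.
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    username_token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(username_token, f"{{{WSSE_NS}}}Username").text = token
    pw = ET.SubElement(username_token, f"{{{WSSE_NS}}}Password")
    pw.set("Type", PASSWORD_TEXT)
    pw.text = password

    # An empty element for a no-argument call such as GetVersion().
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{method_ns}}}{method}")

    return ET.tostring(envelope, encoding="unicode")


# Placeholder credentials and a hypothetical method namespace.
xml_text = build_envelope("my-token", "my-password", "GetVersion", "urn:example:mscom")
print(xml_text)
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint with the appropriate SOAPAction header; the real method namespace and action would have to come from the service’s WSDL.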

© 2001-8 Mark Pilgrim