Wednesday, November 5, 2008

OpenID Troubleshooting for openid4java

On deployment of the bioMoby Annotator, ran into some interesting problems with OpenID.

Firstly, for those of you who are trying to figure out why Yahoo is giving you the

"Warning: This website has not confirmed its identity with Yahoo! and might be fraudulent. Do not share any personal information with this website unless you are certain it is legitimate."

I recommend having a look at the Yahoo OpenID FAQs

I also recommend the following articles:

Why Yahoo! says your OpenID site's identity is not confirmed

If you're using Openid4java from sxip, try this:

If Yahoo is still giving you errors, consider their extra requirements:

Yahoo! will only support Relying Parties running on webservers with real hostnames (IP addresses are not supported) running on standard ports (Port 80 for HTTP and Port 443 for HTTPS).

This means localhost will not work, and if your app is running on, say, port 8080, you'll need to figure out a way to forward from port 80 to port 8080.
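As a rough illustration of the requirements quoted above, here is a small Java sketch that checks whether a return URL has a real hostname and a standard port. The class and method names are hypothetical, and the hostname test is only a heuristic of my own; it is not Yahoo's actual validation logic.

```java
import java.net.URI;

// Hypothetical helper: a heuristic pre-check, not Yahoo's real validation.
final class YahooRpUrlCheck {

    /** Returns true if the URL uses http/https, a non-IP hostname and a standard port. */
    static boolean looksAcceptable(String url) {
        URI uri = URI.create(url);
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false;
        }
        scheme = scheme.toLowerCase();
        host = host.toLowerCase();
        if (!scheme.equals("http") && !scheme.equals("https")) {
            return false;
        }
        // localhost and raw IPv4/IPv6 addresses are not "real hostnames"
        if (host.equals("localhost")
                || host.matches("\\d{1,3}(\\.\\d{1,3}){3}")
                || host.startsWith("[")) {
            return false;
        }
        // getPort() returns -1 when the URL carries no explicit port
        int port = uri.getPort();
        int standardPort = scheme.equals("http") ? 80 : 443;
        return port == -1 || port == standardPort;
    }
}
```

Running a check like this against the realm and return URL before deploying can save a round trip to the provider.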

Your web app also needs to be available from the internet. You can test this by trying to access your website from another computer outside of your intranet.

So in the end: if your web address looks like http://localhost:8080/yourrealm/openid , it's not going to work. It needs to look similar to

If Yahoo is doing discovery of your Yadis document on your realm (i.e. your index.html), make sure the X-XRDS-Location header is set on the response.

Again, if you're using Java Servlets and Apache Tomcat like I am, you'll need to do this using a Servlet Filter:

Instructions on Servlet Filter

Can't tell if it's in the header?
You can type curl -i in your command prompt to check

Another note:
If you're getting:
org.openid4java.message.MessageException: 769: Realm verification

try going into the init method for your openid consumer Servlet
and adding in manager.getRealmVerifier().setEnforceRpId(false); after this.manager = new ConsumerManager();

A good website to go to make sure that an openID 2.0 Provider does work with openid4java is

Sunday, August 31, 2008

Biomoby Web Service Display Page

WebServiceRetrieval.js requires:
jquery.pack.js - v1.2.3 or later
jquery.form.js -v2.12 or later

With regards to biomoby rifraf: progress has been made on the BioMoby Web Service Annotation Page. I finished the javascript file (WebServiceRetrieval.js) that displays a Web Service's descriptive information. Because the RDF graph for a Web Service can get pretty deep and because we want to pull out a lot of information, I've noticed the query slows down and causes timeouts when we access it remotely the way we're doing it. I had to break the query down into steps.

1) Query using a web service uri to get its immediate service description, organization description and operation uid
2) Query using the operation uid to get the operation task
3) Query using the operation uid to get the operation Input parameters
4) Query using the operation uid to get the operation Output parameters

Using ajax calls to submit a form containing the SPARQL queries, the script executes Query 1 and pulls out the operation uid. Queries 2-4 use the operation uid and are all called at approximately the same time (they don't wait for one to complete before executing the next, since each is its own separate ajax call).

Using the jquery form plugin to do the ajax calls which allows the script to work in all browsers: Firefox3, Safari, Opera and IE7.

Small catch: I need to talk to the guys who handle the Virtuoso server that I'm getting this Web Service description RDF as JSON from, because ajax doesn't let you do cross-domain requests for JSON (which we're doing) unless you set dataType: "jsonp".

When dataType is set to jsonp, jquery ajax sends an extra 'callback' parameter in the request to the server. This 'callback' parameter usually has a value like jsonp7120986. The server is supposed to take that and wrap its response JSON in it like so:
jsonp7120986( {responseJSON} ) and set the response to text/plain.

If the server just sends back {responseJSON} without it being wrapped, it will cause an 'invalid label' error.
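The wrapping step itself is tiny. Here is a minimal Java sketch of what a server-side (or intermediate) wrapper could look like; the class name and the callback-name check are my own assumptions, not the code of any particular endpoint.

```java
// Hypothetical helper showing the JSONP wrapping described above.
final class JsonpWrapper {

    /** Wraps a JSON response body in the callback name sent by the client. */
    static String wrap(String callback, String json) {
        // Only accept identifier-like callback names so that arbitrary
        // script cannot be injected through the 'callback' parameter.
        if (callback == null || !callback.matches("[A-Za-z_$][A-Za-z0-9_$.]*")) {
            throw new IllegalArgumentException("unexpected callback name");
        }
        return callback + "(" + json + ")";
    }
}
```

For example, wrap("jsonp7120986", "{\"a\":1}") produces jsonp7120986({"a":1}), which the browser can evaluate as a function call.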

Our SPARQLEndpoint currently doesn't do this and I've been getting around it (so that I could keep coding) by making an intermediate java class which would wrap the JSON response in the callback. I'm sure the guys managing our SPARQL Endpoint will be able to help.

Now to make it look intuitive and beautiful...

Tuesday, August 26, 2008

Instructions for installing ED Wiki

The wiki page on setting up Entity Describer is up. Start with: InstructionsforED

Revisions necessary?


biomoby rifraf

At some point in the next few weeks we will be building/using a customized ED web page to conduct an experimental annotation jamboree on the biomoby web services.  We will be both measuring the differences in collected data between normal social tagging and semantic social tagging and providing a new layer of annotation for the biomoby framework.

The soon-to-be-created biomoby service annotation web page is an example of one application of the ED code (now available on Google code). We hope that others will find many more uses for it over time. When it's ready, the little web page that could will

  1. Request a JSON representation of the RDF describing a biomoby web service from a new SPARQL endpoint containing the BioMoby service graph.  See here for an RDF/XML rendition of such a service description.
  2. Display the available information about the service in a reasonably intuitive and beautiful fashion.
  3. Allow the user to tag the service with either semantic tags (e.g. from the service ontology or elsewhere) or free text tags.
  4. Send the tagging information to the ED RDF repository.

The question of what exactly to collect and why is still a little vague however.  Why and how would tags, semantic or otherwise, be useful in the biomoby system?

Moby2.0 developer Luke McCarthy requested that the service tags focus on defining what might be reasonable semantic relationships (predicates) between the input and the output of the service. An example, for a basic blast service, might be: service input a) 'is homologous to' service output b).

Any other suggestions regarding service semantics of interest warmly requested and appreciated.


Monday, August 25, 2008


ED is holding a code license: New BSD License
and a content license: Creative Commons 3.0 BY

The content license comes in BY and BY-SA

This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered, in terms of what others can do with your works licensed under Attribution.

SA: This license lets others remix, tweak, and build upon your work even for commercial reasons, as long as they credit you and license their new creations under the identical terms. This license is often compared to open source software licenses. All new works based on yours will carry the same license, so any derivatives will also allow commercial use.

For all the licenses:


ED is now recording the aliases for a term. Aliases are being recorded as an rdfs:label attached to the topic (the guid of the term).

Google code

There's the site for ED on google code. All the code for ED is contained within the (surprise, surprise) 'ED' folder in the trunk. I've also created a new folder within the trunk that contains the javadocs, which can be reached at:

which is linked to in the Entity describer google code project home page.

I packaged the IcaptureUtils project that the Entity describer uses as a jar file and included it in ED's referenced libraries.

Monday, August 18, 2008

Specify your own SPARQL Endpoint for ED

Changed the file to include specification for the SPARQL Endpoint.

SPARQL_EP_query_param_name = query
SPARQL_EP_misc_params = { \
"default-graph-uri": "sandbox", \
"should-sponge": "", \
"format": "text/html", \
"debug": "on" \
}

SPARQL_EP_url specifies the SPARQL Endpoint URL

SPARQL_EP_query_param_name specifies the name of the query parameter used by the SPARQL Endpoint.
ie. the name in the name/value pairs used when submitting a SPARQL query as a URL parameter

SPARQL_EP_misc_params is a JSON object with all the name/value pairs that the SPARQL Endpoint needs for URL Parameters

These specifications are used to set the same parameters kept in a SPARQL_EP_params JSON object in the ed_connotea_patched.js
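To make the role of the three settings concrete, here is a hedged Java sketch of how a request URL could be assembled from them. The class name, the example endpoint URL and the use of a GET-style query string are assumptions for illustration; ED's actual javascript may build its requests differently.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper: combines the endpoint URL, the query parameter name
// and the misc name/value pairs into one request URL.
final class SparqlUrlBuilder {

    static String build(String endpointUrl, String queryParamName,
                        Map<String, String> miscParams, String sparql) {
        StringBuilder sb = new StringBuilder(endpointUrl);
        sb.append('?').append(enc(queryParamName)).append('=').append(enc(sparql));
        for (Map.Entry<String, String> e : miscParams.entrySet()) {
            sb.append('&').append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    // URL-encodes one name or value as UTF-8
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

With SPARQL_EP_query_param_name set to query, the SPARQL text ends up as the value of the query parameter and every entry of SPARQL_EP_misc_params is appended after it.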

Friday, August 15, 2008

Finished the basics for the ED Widget.
It's in two JS files:
-EDwidget (which needs to be included in the html page and handles the adding of all necessary css and javascripts on the fly)
-EDlet (this creates the actual EDlet inside the div with id="EDlet")

To use it: (Haven't deployed it yet, so don't expect to use it right away)

Include a script tag in the html head with
type = "text/javascript"
src = ""

include a div somewhere with id="EDlet"

I'm leaving the styling of the EDlet div up to the web developer using it in their page, but I have included some basic EDlet API functions to make it easier.

Note: You need to be logged into ED for this to work since it requires a cookie

Wednesday, August 6, 2008

Connotea Separation

Went through the new ed_connotea.js.template javascript and finished modifying it like I did the previous ed_connotea js so that it works with the Connotea-free ED that I'm working on.

Tuesday, August 5, 2008

URL Limit Solution

After having a talk and some code Review with Eddie on Friday, he offered an interesting solution to the URL Limit which doesn't require ajax or any modification to the servlets.

He suggested that I submit the tagging xml to the Tagging Servlets (TagManagerSaveTaggingServlet and SPARQLTaggingServlet) as a file. This would involve submitting a whole form to the TaggingServlets, containing an input of type file. Great! But... when you create a file input element, it comes with the browse button which allows you to pull files from your computer. We don't want this though. We want to create a file on the fly and feed it into the input stream that the file input element uses.

I did not find out how to do that.

However, the following solution is much like the file solution. Instead of making a file on the fly, I put the tagging xml in a hidden input inside a form (also made on the fly) and submit that form to a url within a designated iframe:

-create a form on the fly with action=TaggingServlet and target=iframe
-create an input of type hidden on the fly with the tagging xml as its value
-attach the hidden input to the form
-attach the form to the body
-submit the form

Surprisingly... this works! I've yet to find (through testing or research) a limit for the size of an input in any browser. Further testing may reveal otherwise, but IE isn't complaining, which is rare.

The other plus side is that the tagging Servlets don't need modification, since it works in the exact same way as before except the 'tagging' parameter is no longer in the url but part of a form, which, as far as the Servlet is concerned, is the same thing!



for now....

Tuesday, July 29, 2008

Internet Explorer 7 URL limit

Here's the catch....

After much testing, it comes down to this: the tags aren't being inserted in IE7 because of the URL size limit. The limit in IE7 is supposedly 2048, which is weird because I'm getting stuck around 1600. This is for GET requests in IE.

Monday, July 21, 2008

Accessing your webapps on your Mac OS X from your Parallels

If you're using a mac and running Parallels and want to test your web apps in IE without having to package it in a war and move it over to your Parallels side...

You will find this INCREDIBLY useful:

Andy Peatling's HOWTO access your webapps on Mac OS X from Parallels Windows XP

And if you're easily confused like I am...
DocumentRoot refers to your CATALINA_HOME/webapps/
ServerName is whatever you want it to be

With ED, I set it up locally as (for example) www.entitydescriber.local
so as opposed to using localhost

I can now do http://www.entitydescriber.local:8080/ED/manualURI

Of course, all this configuration with IPs is only really useful if your IP doesn't change. If it does, like with me, then you need to reconfigure for your work IP when you get to work and for your home IP when you get home.

Friday, July 18, 2008

Error Handling

Getting 504 Errors from our Virtuoso OpenLink SPARQL Endpoint at
And the Tomcat Manager's down.

So I've taken this opportunity to test out some error handling in ED. Mostly just refining the e-mail function I have set up to report any errors that may pop up in the SPARQLTaggingServlet. Usually it comes down to an error in trying to connect to the SPARQL Endpoint, in which case I fire off an e-mail to the developers' gmail with
-the Response Code
-tagging XML
-Time of Error
-Extra Information (Insert statement, HTML returned from the connection (usually contains some clue as to what happened) and some information about where the error occurred)

Put in a variable and some if statements to activate and deactivate the use of the SPARQL Endpoint in case it goes down but we still want ED to work with just the ED Database on arch. In addition to this, I've decided that for any error codes returned from the SPARQLTaggingServlet, or timeouts while submitting, I'll just let the submission go through to Connotea anyway and hide the fact that there was an error from the user.

1) The tags were submitted to ED Database via Jena already
2) If there was an error in the SPARQLTaggingServlet - it has been e-mailed to the developer's email, recorded and I will know about it
3) People can still keep using ED even if the SPARQL Endpoint goes down

which seems to be happening a lot lately


The web methods for the ED API have been committed, but not yet deployed.

Interesting thing I ran into when trying to do an HTTP Get on google book search using Java's URLConnection.

Google adheres to a slightly different API from other web resources when you try to connect to it.

Found out about it here:

The suggestion in this thread works for google and searchmash (which is a cool little google API that returns results as JSON to you)

This suggestion works for google book search:

URL size Limit

In an earlier post, I mentioned that one of the issues of why the ed_connotea javascript would not call the SPARQLTaggingServlet was due to the size of the parameters. The xml being attached as a parameter was too long (as it turns out, it was because the xml was being duplicated and appended to itself - effectively increasing the size)

Here's some specs I found for different browsers and parameter size limit:
The site's a little old, but the main point is that there IS a size limit and that it might be better to find a different way of submitting the xml other than as a parameter, unless we're guaranteed the xml will always be under the size limit of all the browsers we intend to support and of the server we're using.

The site where I found the specs

Internet Explorer:
Firefox at least: 100,000
Safari at least: 80,000
Apache WebServer 4,000

Will need to do some testing myself to see what the ACTUAL limits are.

Sunday, July 13, 2008

Why there was an unexpected limit on the number of tags you could add to ED

I didn't realize this until I started adding a progress bar to ED as per Ben's suggestion. Originally, I suspected ED was slower with the submitting of tags because I added in that extra step of adding tags to the Virtuoso Server with SPARQL. As it turns out, some of the tags were being repeated, so the SPARQLTagManager was adding a tag 2 or 3 times!

I traced this problem back to the xml the ed_connotea javascript was feeding it. What's worse, if you have too many tags with too many types associated with them, then the xml would get to be too big and it can't be passed as a parameter when we call the SPARQLTaggingServlet.

Firefox doesn't state that there is a limit on the size of the parameter.
But there is one for IE.

The trouble with the xml was that every time I called save_tag_action() to make it, it appended the tags on to the existing xml (I didn't realize this), so I'd end up with duplicate tags. It's called once for the TagManagerSaveTaggingServlet and again for the SPARQLTaggingServlet. To fix this problem, I call save_tag_action() once, save the xml and reuse it for both servlets.

I wonder if it would be prudent to keep the xml in a cookie and destroy it once we're done.

Friday, July 11, 2008

Ajax and ED

It's time for more Ajax...

Here's the problem I'm running into with ED. When the user hits the submit button, the javascript calls a Servlet which composes the SPARQL/Update Insert statements and then submits them to the Virtuoso Server.

I have to break up the Insert statements into an Insert for Tagging information (taggedBy, taggedResource, taggedOn, etc) and an Insert for each Tag in case it doesn't already exist. The Tag Insert can be a small Insert statement or a big one depending on how many types it has. The more types it has, the bigger it is because I have to create the Type in case it doesn't already exist.

So for each insert I'm doing a Post to the Virtuoso Server awaiting reply before I continue and send off the next one.

Thus, with the more tags there are, the more Posts I'm making, the longer it takes, especially if it takes a while for the information to go across the wire and back. (What if someone's tagging in Africa!). I COULD spawn several threads to do a submission for me for the tags to avoid the "stop and wait" protocol I'm using right now...
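For comparison, the thread-based alternative mentioned above could look roughly like this in Java. This is only a sketch: the poster function stands in for whatever code actually performs the HTTP Post and returns a response code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: submit all inserts concurrently instead of
// waiting for each reply before sending the next one.
final class ParallelInserts {

    /** Returns the response codes in the same order as the inserts. */
    static List<Integer> submitAll(List<String> inserts,
                                   Function<String, Integer> poster,
                                   int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String insert : inserts) {
                Callable<Integer> task = () -> poster.apply(insert);
                futures.add(pool.submit(task));
            }
            List<Integer> codes = new ArrayList<>();
            for (Future<Integer> f : futures) {
                codes.add(f.get()); // blocks until that Post has finished
            }
            return codes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The total wait then approaches the time of the slowest Post rather than the sum of all of them, at the cost of having to check every returned code for failures.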


as Ben suggested. I can use Ajax!

Question is. How do I use Ajax? Well here's what I'm thinking right now...
Ajax lets me talk to the server and tell it to run a function without having to reload the page.

When someone adds a new tag. I'll use Ajax to submit that tag (since a tag is independent of a user until you make a connection using the "associatedTag" property) to Virtuoso. I'll leave the associatedTag property for later when the user ACTUALLY hits submit.

If the tag already exists, *shrug... user doesn't know.
If the user deletes the tag... the tag will still be added, but it won't be associated with the actual tagging, since it won't be included in the submit.

Trouble: What if a post to the Virtuoso Server fails? I guess I'd have to check for that...

Emailing Errors!

I can see this being advantageous to ED in the long run.
I've gone and implemented an EDemailer class that uses Javamail to send off emails.
Whenever I get an error in either my SPARQLTagManager or SPARQLTaggingServlet, the EDemailer's send function is called and an email is fired off to an account I set up. ""

The information I pass on is the GMT time it happened (to conform with what is being recorded in the database), the xml that was passed into the SPARQLTaggingServlet, the response code (if it gets that far) and the SPARQL Insert statement (if it gets that far).

I'm using "" as the SMTP host (Port 465). You have to authenticate with them with a gmail account in the code or else you can't use it. Since I'm using JavaMail I had to get the activation.jar and mail.jar. Supposedly J2EE comes with them, but that's a lie, so I had to get them from Sun's site. On top of that, if you put activation.jar and mail.jar in your lib under WEB-INF, it won't work! Those jars NEED to be in Tomcat's lib folder, AND ONLY in Tomcat's lib folder. If you don't do this, you will make your mailer program sad!

If you're going to be using an SMTP host that's running on your localhost, you'll need some other configuration for that, involving Tomcat's server.xml and context.xml.

Haven't tried that out yet, so I won't make any claims.

But at least this way, assuming I'm checking the ed.developers gmail account (I'll forward it to my normal one later), I will be informed when things go wrong.

Still need to deploy this on arch.uwindsor of course.... Hopefully if I put the extra jars in its lib it won't complain....

Monday, July 7, 2008

Malformed/reserved Characters

Connotea doesn't like apostrophes (') and backslashes (\). Actually, I've successfully posted tags with apostrophes in them, but it just doesn't like it when ED does it.

As a fix, we've gone and modified the javascript to scan the Semantic Tags for these characters and remove them prior to submission to Connotea.

Delay time in submitting to Connotea

Now that we've got ED submitting tagging information to the sandbox graph of the Virtuoso server, in addition to submitting it to the ED Database, I've had to put in a delay to prevent the Form from submitting until all the information has been submitted to the sandbox and confirmed successful.

Trouble is... this can take a while, since I need to break up a submission into a SPARQL insert statement for every tag. The number of types for a tag is variable, which makes a single insert statement potentially very, very long.

And if you have A LOT of tags. The wait for a response can get even longer.

I set it at 20 seconds before it times out and an error page is presented to the user.

Before, I was able to add up to 6 or 7 tags with 2-4 ftypes each and it would take under 10 seconds, but just recently Ben showed me an error that he got on Entity Describer that went past the 20 second mark.

I'll have to test how many tags + ftypes I can submit before I have to say it's taking too long, but if need be I can analyze how long a tag insert is going to be and start merging inserts together if they're short enough. Seems a waste to use a whole HTTP Post for a 1 tag, 1 ftype insert.
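The merging idea could be sketched in Java along these lines. It assumes the endpoint accepts several insert statements in one request, which I haven't verified here, and the class name and the newline separator are hypothetical choices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: group short insert statements into batches so that
// each HTTP Post carries as many statements as fit under a length limit.
final class InsertBatcher {

    static List<String> merge(List<String> statements, int maxLength) {
        List<String> batches = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String s : statements) {
            // start a new batch if adding this statement would exceed the limit
            if (current.length() > 0 && current.length() + 1 + s.length() > maxLength) {
                batches.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append('\n');
            }
            current.append(s);
        }
        if (current.length() > 0) {
            batches.add(current.toString());
        }
        return batches;
    }
}
```

A statement that is longer than the limit on its own still goes out as a single batch, so nothing is dropped.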

Monday, June 30, 2008

The SPARQLTaggingServlet with ED

Now that the SPARQLTaggingServlet and SPARQLTagManager are complete, I've gone ahead and integrated them into the rest of ED. I call the SPARQLTaggingServlet as I would ED's current TagManagerSaveTaggingServlet. EXCEPT! I stop the form from submitting to Connotea until it's confirmed that all the tags have first been put into our Virtuoso database. Basically this involves having to 'return false' on all the functions bound to 'submit' so that the form doesn't submit. and then setting a setTimeout to check for a response from SPARQLTaggingServlet saying it's okay to submit before calling submit from the javascript.

Interesting point: Ben discovered an error in this new addition. When he tried adding a new bookmark, Tomcat went crazy saying it was low on memory. Although I did some basic testing on it, I guess I didn't do enough... I suspect that it's because I forgot to close something that needs to be closed.

Thursday, June 26, 2008


I've added 1 new class and 1 new servlet to the org.icapture.ED.tag package. SPARQLTaggingServlet is a servlet that functions in the same way as TagManagerSaveTaggingServlet. SPARQLTagManager is a java class that functions in the same way as the

When a user hits the submit button on ED - the javascript normally calls the TagManagerSaveTaggingServlet (passing into it all the tagging data in xml format as a parameter). SPARQLTaggingServlet will be used in the same way. Call it in the same way with the same xml. SPARQLTaggingServlet acts as Servlet wrapper for the main class which is SPARQLTagManager. It creates a Tagging object with that xml and passes it into SPARQLTagManager which processes it into SPARQL/Update (SPARUL) INSERT statements and posts them to a Virtuoso server via "" as parameters which are then executed.

If it is successful (determined inside SPARQLTagManager via response codes), all inserts were made successfully and it will return the ResponseCode from the Virtuoso Server (200 if successful, anything else otherwise) to the SPARQLTaggingServlet.

Tuesday, June 24, 2008

Manual URI entry Form for ED!

Forgot to post about this earlier.
Did up a manual URI form for Entity Describer so that you can enter a URI manually.

URL being ''

Posts the URI to Connotea and ED Database as well and requires that you enter a URI and Title.

Problems I ran into:
Not sure how to circumvent Connotea's fixsize() function which resizes your browser window. I can see that becoming really annoying in the future.

Any ideas?

Monday, June 16, 2008

Hash instead of SessionID

With regards to security of ED, Ben suggested we drop the sessionID cookie because it places a need on the server to keep that sessionid (which will eventually expire). What we want is to be able to log into ED and stay logged in indefinitely by relying solely on the cookie that we pass on to the client for authentication. (This may prove troublesome for people who use computers other than their own and don't log out, since the cookie is then never deleted.)

Instead, I've gone ahead and had a look at how Connotea does it using a Hash value in a cookie to use for authentication. To learn more on it I googled "hash" and "cookie". Several interesting recipes later... I decided to refine my search. Just another example of how big a difference context and semantics make in the web!

Lucky me since Java libraries already provide me with a means of generating Hash values via the MessageDigest class. I generate a hash value using the user's openid, logintime and a secret value known only on the server's end. I then create an ed_hash_cookie for the user which contains the openid, logintime and hashvalue.

To authenticate for a restricted page, we look to see if the ed_hash_cookie exists, and then recalculate the hash value from the cookie's openid, logintime and our secret value to make sure it matches the hash value in the cookie.

This will make it harder for unauthorized users to use ED, since it requires them to make a cookie with a username, logintime and hash value. Unless they know the secret value used by ED to calculate hash values, they will fail the authentication when the server recalculates the hash and finds a mismatch with the cookie's hash value.

Of course, this is still vulnerable to the same attacks as described earlier regarding cookie theft. Someone could still potentially sniff out a hash_cookie (er... no pun intended) and use that to get into ED.

I'm currently using the MD5 algorithm to calculate the hash and encoding it in hex, but I'm going to switch to SHA, given that the SHA family is considered stronger than MD5 and is used in TLS, SSL and other security applications.
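A minimal sketch of the scheme using MessageDigest might look like this. The class name, the "|" separator between fields and the choice of SHA-256 are assumptions for illustration, not necessarily what ED uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of the hash cookie check described above.
final class HashCookie {

    /** Hex-encoded SHA-256 over openid, login time and the server-side secret. */
    static String hash(String openId, String loginTime, String secret) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            String input = openId + "|" + loginTime + "|" + secret;
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recalculates the hash from the cookie fields and compares it to the cookie's hash. */
    static boolean verify(String openId, String loginTime, String cookieHash, String secret) {
        byte[] expected = hash(openId, loginTime, secret).getBytes(StandardCharsets.UTF_8);
        byte[] actual = cookieHash.getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual compares without stopping at the first mismatch
        return MessageDigest.isEqual(expected, actual);
    }
}
```

A keyed HMAC (javax.crypto.Mac) is the more standard construction for this kind of signed cookie, but the plain secret-in-the-hash version above matches the approach described in this post.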

Thursday, June 12, 2008


Note to self: ED 2 Connotea uses existing javascript from Connotea. ie. the addtag(tag ,clear) function

For EDnoConnotea (ED separated from Connotea) I've created a javascript 'copiedFromConnotea.js' with... as you might guess, javascript copied from Connotea.

Will remove superfluous functions later.

Wednesday, June 11, 2008

OpenID Providers

This is a list of the OpenID Providers I've been using for testing:



For Connotea: I've noticed most OpenID Providers work, although Yahoo, myVidoop, and claimID seem to give it problems. This may be because it doesn't know how to deal with https://

So far the only ones I've had trouble with are Vidoop, which may be (as Ben suggested) because of a missing parameter.

June 16, 2008

Figured out what was wrong. Bracket. Wrong place. Funny how still let me through. It's probably because they're still operating off version 1.0 of OpenID and not 2.0, which is where my mistake occurred. But praises to Vidoop for replying to me!

Tuesday, June 10, 2008

ED, Connotea, OpenID

I've added in a link for the current ED to log in with an OpenID.
Really all it's doing is submitting the OpenID to Connotea to handle the rest of the work.
But the trouble is, once I've passed it off to Connotea, it's out of my control, so I can't have the same window/tab redirect back to ED once it's done authenticating.

So what I've done is have the AddToED servlet open in a new tab/window while the other one handles the authentication.

I imagine this will lead to confusion later if the authentication doesn't go through since people will simply continue on with ED and THEN find out they haven't actually logged into Connotea once they've submitted their bookmark and annotations.

Another issue I ran into with Connotea was that it doesn't like all OpenID Providers. I've been trying the more common providers: Yahoo, Vidoop and myOpenID, but so far the only one that has worked is myOpenID.

And yes... I did make sure to change the OpenID in the Advanced Setting of Connotea before trying each one.

Monday, June 9, 2008

About FetchRequest of OpenID

With regards to an earlier post:

Using a FetchRequest:
This is how it's done in version 2.0 and this is what all OpenID Providers use. Currently I haven't got this working yet, but as I found out, it was because I don't have the latest version of OpenID4Java and will have to get the latest from their SVN off of googleCode. (I'll update this as soon as I get it working) This googlegroup post explains it in more detail.

I checked out the latest Openid4java off of their SVN on GoogleCode at: openid4java-read-only

Went into my console, went to the project I just checked out and typed in:

>ant jar

to build a new jar for java-openid-sxip. The jar will be located in the build folder as: java-openid-sxip. I replaced the existing java-openid-sxip jar in my referenced libraries and as a result, it has fixed the earlier problem I was having with the FetchRequest not working. This was because my parameters were showing up:

key: openid.ns.ext1

when it should have been just

key: openid.ns.ext1

The -draft7 that sxip left in there as a download prevents the FetchRequest from working. The only trouble I'm running into now is Yahoo! identifying my website as unsafe and not sending it user information. This is most likely because I'm running the webapp locally under localhost, which is not a valid url for openID to Yahoo! and thus considers it an unsafe website.

Sunday, June 8, 2008

Entity Describer and GoogleCode

Ben suggested putting Entity Describer up on GoogleCode to make it a publicly available tool (which technically it is right now)... But on GoogleCode people will be able to contribute stuff to it! and as Ben put it, "It'll make sure we keep good coding style, since other people will be looking at and using it".

I've had previous projects on GoogleCode because it makes source control for school group projects so much easier, but I couldn't remember if there were any rules ED should be concerned about, so I had another quick look.

-Anything that goes on GoogleCode is Open Source.
-We can terminate the project whenever we wish to and take it off GoogleCode
-GoogleCode can terminate our project if they catch us doing something illegal
-When you create a project, you choose an open source license (GNU General Public License v3, Apache License 2.0)
-On set up you establish the project owners and the project members who are both capable of contributing/making changes to the code
-Anyone may download a READ-ONLY copy of the code (ie. Can't commit changes to GoogleCode for this specific project)

Looks good to me.

Terms of Service

Friday, June 6, 2008

Trying out using OpenID to fetch user information

OpenID lets you log into an existing account on a website using an OpenID URL. It can also be used to register you with a site that you have NO account with. Normally when you sign up for a site you have to create an account with profile/persona information. UserName, Password, email address, Firstname, lastname, DOB, etc etc etc. However, using an OpenID URL on a site you've never been to automates this process.

This is what happens:
-The site (aka Relying Party) recognizes you're new and don't have an account with them.

-The site contacts the OpenID Provider asking them to authenticate you, and adds on an extension asking for your information: UserName, email, address, Firstname, lastname, DOB, etc etc etc.

-Your OpenID Provider SHOULD prompt you for permission to give this information to the site and lists out the information the site wants.

-You click "Yes" and the site automatically creates an account for you using that information.

Because ED may very well have a user management system in the future and want to keep information on its users (e.g. e-mail), I've left hooks in the OpenIDHandler to do this. Some things you may want to know about:

There are two different ways of fetching user information from an OpenID Provider.

Using an SRegRequest:
This was how it was done in OpenID version 1.0, and a lot of providers such as Vidoop and myopenid still allow it, but others like Yahoo do not.

Using a FetchRequest:
This is how it's done in version 2.0 and this is what all OpenID Providers use. I haven't got this working yet, but as I found out, that is because I don't have the latest version of OpenID4Java and will have to get it from their SVN on GoogleCode. (I'll update this as soon as I get it working.) This googlegroup post explains it in more detail.

What's the difference?
FetchRequest gives you a few more functions to get more information if you want, such as being able to get more than just one e-mail address.

I've put in both as hooks, and for testing each one sits in its own if statement. Activating and deactivating them depends on the values in org.icapture.ED.openID.Constants
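To make the difference between the two approaches a bit more concrete, here is a rough sketch of the extra parameters each one adds to the authentication request. It does not use OpenID4Java at all: the class and method names are made up for illustration, and the parameter names and namespace URIs are the ones I understand the Simple Registration 1.1 and Attribute Exchange 1.0 extensions to define, so check them against the specs before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of OpenID4Java): builds the raw extension
// parameters that ride along with an OpenID authentication request.
final class ExtensionParams {

    // Simple Registration: a fixed set of fields requested by short name,
    // passed here as a comma-separated list such as "email,fullname".
    static Map<String, String> sreg(String requiredFields) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("openid.ns.sreg", "http://openid.net/extensions/sreg/1.1");
        params.put("openid.sreg.required", requiredFields);
        return params;
    }

    // Attribute Exchange fetch request: each attribute has a type URI and a
    // local alias, and the request can ask for more than one value of it.
    static Map<String, String> axFetch(String alias, String typeUri, int count) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("openid.ns.ax", "http://openid.net/srv/ax/1.0");
        params.put("openid.ax.mode", "fetch_request");
        params.put("openid.ax.type." + alias, typeUri);
        params.put("openid.ax.count." + alias, Integer.toString(count));
        params.put("openid.ax.required", alias);
        return params;
    }

    public static void main(String[] args) {
        System.out.println(sreg("email,fullname"));
        System.out.println(axFetch("email", "http://axschema.org/contact/email", 3));
    }
}
```

The count parameter is what allows asking for several e-mail addresses at once, which Simple Registration has no way to express.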

Thursday, June 5, 2008

OpenID, Entity Describer and Security

Originally I wanted to publish restricted sites under the web.xml and take advantage of HTTP's existing authorization and authentication functions. This way, 401 or 403 errors would be produced when someone types in a restricted URL without authenticating. However, I have not found a way to use these authentication schemes (Basic, Form-based, Digest and Client Certificate) without implementing a username and password and leaving all the authentication work to those schemes, even though authentication is already being done by OpenID.

The next method I looked at to restrict site access was using the SessionID. As it stands, as soon as ED receives the authentication from the OpenID Provider and has verified it, ED sets a sessionID cookie for the user. So long as that cookie exists, the user may access ED without logging in again. On logging out, the sessionID cookie is destroyed and the session is invalidated. This method requires that every restricted page check whether the sessionID cookie from the client and the sessionID on the server match. If the session is invalid or the cookie sessionID does not match, the user is redirected to the login page.
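The per-page check described above boils down to something like the following. This is only a sketch in plain Java, not ED's actual code: the class name, method names and the login path are hypothetical, and in a servlet the two IDs would come from the request's cookie and the server-side session.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hypothetical sketch of the check each restricted page would run.
final class SessionCheck {

    // True only when both IDs are present and identical. A missing cookie or
    // an invalidated session (null) never matches.
    static boolean matches(String cookieSessionId, String serverSessionId) {
        if (cookieSessionId == null || serverSessionId == null) {
            return false;
        }
        // MessageDigest.isEqual is used instead of String.equals so the
        // comparison does not stop at the first differing byte.
        return MessageDigest.isEqual(
                cookieSessionId.getBytes(StandardCharsets.UTF_8),
                serverSessionId.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the page to serve: the requested one, or the login page.
    static String route(String cookieSessionId, String serverSessionId,
                        String requestedPage, String loginPage) {
        return matches(cookieSessionId, serverSessionId) ? requestedPage : loginPage;
    }

    public static void main(String[] args) {
        System.out.println(route("abc123", "abc123", "/ed/annotate", "/ed/login"));
        System.out.println(route(null, "abc123", "/ed/annotate", "/ed/login"));
    }
}
```

In a servlet setup this logic would most naturally sit in a Filter mapped to the restricted URLs, so individual pages don't each have to repeat it.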

Cookie Theft
The major concern with keeping a cookie for the sessionID is that a person may determine the sessionID from packet sniffing, leading to possible impersonation of a user. One way around this is to use SSL or TLS which encrypts data going back and forth between the client and server, thus encrypting the cookie. This service is requested of the Web Server which hosts your website.
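Once the site is served over SSL/TLS, it also helps to mark the cookie so the browser never sends it over plain HTTP. The sketch below just builds the Set-Cookie header value by hand to show the attributes involved; the class name is hypothetical, and a real servlet would set the same flags through whatever cookie API the container provides.

```java
// Hypothetical sketch: the Set-Cookie value for a session cookie that should
// only travel over HTTPS and should not be readable from page scripts.
final class SessionCookieHeader {

    static String build(String name, String value) {
        // Secure: only sent over HTTPS. HttpOnly: hidden from JavaScript.
        return name + "=" + value + "; Path=/; Secure; HttpOnly";
    }

    public static void main(String[] args) {
        System.out.println("Set-Cookie: " + build("sessionID", "abc123"));
    }
}
```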

If anyone has any other thoughts on how to implement security, I would be more than happy to hear it.

OpenID and Entity Describer

So in the last week, I've gone about the task of setting up and testing OpenID for the next Entity Describer minus Connotea. Fortunately there were a lot of forum posts and advice on the web on how to get OpenID set up for a website.

A good overview of it is: A recipe for enabling OpenID on your site
Since Entity Describer is programmed using Java Servlets, I used the OpenID4Java library. I chose this one in particular because it specifically states that it supports version 2.0, which is what most OpenID Providers conform to now. For other programming languages, a complete list of libraries may be found here.

After installing the libraries, the rest was implementing the servlet to handle the OpenID login. I've saved it under the ED project as package org.icapture.ED.openID. Most of the example code I was following came from the library I downloaded from OpenID4Java, particularly the INSTALL file and the examples folder.

The workflow for OpenID goes as such:
-User requests the OpenIDHandler Servlet and a ConsumerManager object is instantiated. The ConsumerManager is the main OpenID object which handles most of the communication between ED and the OpenIDProvider(OP).

-The user submits an OpenID URL.

-Our website (also referred to as a Relying party or Consumer) extracts from that who the OpenID Provider (OP) is and does a discovery on them to make sure they exist. This is done by calling the discover function of the ConsumerManager.

-Once our website knows the OP exists, it forms an AuthRequest message to request authentication from the OpenID Provider. This is done by calling the ConsumerManager's authenticate function. This will cause the OP to do a GET on the returnToURL you passed into the authenticate function. With ED I gave it the OpenIDHandler's URL with the parameter "?return=1". Bear with me, this is where it gets interesting, because I'm going to start talking about XRDS documents and the YADIS protocol.

-When the OP is doing a GET on the returnToURL, it is expecting to get an XRDS document from it. What's an XRDS document? It stands for eXtensible Resource Descriptor Sequence. It's also known as a Yadis Resource Descriptor. In it I've specified the returnToURLs OpenID may return to after it's done authenticating (I just tell it to go back to the OpenIDHandler servlet). It's used to let the OpenID Provider know that the returnToURLs are using OpenID.

-There are a few different ways of giving the xrds document. But according to the Yadis Protocol you should do this by adding the header "X-XRDS-Location" to the response with the value being the URL where your xrds document is located.

response.addHeader("X-XRDS-Location", xrdf_path);

In ED I've named the document: Yadis_xrds_doc.xrdf in /WebRoot/ed. I've also added the mime type for .xrdf extension under the web.xml
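For anyone wondering what goes inside that document, the sketch below generates a minimal XRDS listing one return-to URL. It is an illustration rather than ED's actual file: the class name and example URL are hypothetical, and the namespaces and the return_to service type are what I understand the OpenID 2.0 spec to expect for Relying Party discovery.

```java
// Hypothetical sketch: renders a minimal XRDS document that advertises a
// single OpenID 2.0 return-to URL for a Relying Party.
final class RpXrds {

    static String forReturnTo(String returnToUrl) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<xrds:XRDS xmlns:xrds=\"xri://$xrds\" xmlns=\"xri://$xrd*($v*2.0)\">\n"
             + "  <XRD>\n"
             + "    <Service>\n"
             + "      <Type>http://specs.openid.net/auth/2.0/return_to</Type>\n"
             + "      <URI>" + escapeXml(returnToUrl) + "</URI>\n"
             + "    </Service>\n"
             + "  </XRD>\n"
             + "</xrds:XRDS>\n";
    }

    // Escapes the characters that would otherwise break the XML, which
    // matters as soon as the URL carries more than one query parameter.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(forReturnTo("http://example.org/ed/OpenIDHandler?return=1"));
    }
}
```

The Yadis content type for such a document is application/xrds+xml, which is the value the mime mapping for the .xrdf extension would normally point to.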


-*Please note: if you put a breakpoint during this GET, it will cause the authentication to fail.

-After all that, the OP has now verified that you're OpenID enabled and you've got an AuthRequest object. What to do next? That depends on the version of OpenID the OP is using. Usually it's version 2.0, but ED handles both versions just in case. If the OP is using version 1.0, just redirect to the OP's URL that's responsible for handling authentication (also referred to as OPEndpoint) like so...


If it's version 2.0 then it's a little more complicated. Basically, you add 2 attributes to the HTTP request: one being the OPEndpoint and the other being the parameterMap you get from the AuthRequest. Then you forward the request and response to a dispatcher, which is just another page that uses the same request and response and does the work. With OpenID version 2.0 the dispatcher MUST have a form with a submit button, and the parameterMap is turned into inputs of type HIDDEN inside the form. The form posts to the OPEndpoint.

-The OP handles the authentication from here and returns control back to the URL designated as the return to URL. Here, ED verifies the Response from the OP by calling ConsumerManager.verify. This is to make sure it DID come from the OP we sent to. If it is verified, the user is authenticated and ED changes state to logged in.
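The version 2.0 dispatcher step in the workflow above can be sketched without any servlet or OpenID4Java classes. The code below is only an illustration: the class name, the endpoint URL and the sample parameter are assumptions, and in ED the endpoint and parameterMap would be the two attributes set on the request from the AuthRequest.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of what the dispatcher page produces: a form that
// posts every entry of the parameterMap to the OPEndpoint as a hidden input.
final class OpenIdFormPost {

    static String render(String opEndpoint, Map<String, String> parameterMap) {
        StringBuilder html = new StringBuilder();
        html.append("<form method=\"post\" action=\"")
            .append(escapeHtml(opEndpoint)).append("\">\n");
        for (Map.Entry<String, String> entry : parameterMap.entrySet()) {
            html.append("  <input type=\"hidden\" name=\"")
                .append(escapeHtml(entry.getKey()))
                .append("\" value=\"")
                .append(escapeHtml(entry.getValue()))
                .append("\"/>\n");
        }
        html.append("  <input type=\"submit\" value=\"Continue\"/>\n");
        html.append("</form>\n");
        return html.toString();
    }

    // Escapes characters that would break out of an HTML attribute value.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("openid.mode", "checkid_setup");
        System.out.print(render("https://op.example.org/auth", params));
    }
}
```

A form post is used here rather than a redirect because the parameters travel in the request body, which avoids hitting URL length limits when the request carries extensions.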

Useful Links:
OpenID4Java JavaDocs

OpenID version 2.0 Specs

Brief Description of what YADIS is

Wednesday, June 4, 2008

hello world!

Hello fellow ED developers! 

Introduction to openID

The rundown:

What is it?
OpenID lets you log into (or even register with) other websites that support OpenID by just using an OpenID URL instead of a username and password. This does away with the need for having a separate identity (username and password) for each website that requires you to register with them. Websites supporting OpenID will usually have a link to an OpenID login page below their normal username/password login page. You get an OpenID URL by registering with an OpenID Provider such as Yahoo or myopenid.

With openID, a website (the relying party or "consumer") delegates the task of authentication to your OpenID Provider (the website that you have an openID with). You can even register with a website you don't have an account with since the openID Provider gives them the registration information that website needs, such as address, e-mail or full name.

Example OpenID Providers:

The official OpenID website
Specs on OpenID version 2.0: Good for definitions