Validator Tutorial: How to Build Your Own Rules?

 

Previous: How to Build CV Mapping Rules ?

Next: Wiring It Together - Bringing All Components Together

 


A. What can these user-defined rules do for you?

 

Essentially, whenever the CV mapping rules cannot model the validation you want to apply, the Object Rules are the alternative. There is no inherent limit to what these rules can do, as long as you are able to program them in the Java language, using any of the many libraries available for it.

 

B. Implementing your first rule

The validator API defines a class that one has to extend in order to write a rule:

psidev.psi.tools.validator.rules.codedrule.ObjectRule 

 

The class diagram below illustrates this part of the Validator's data model:

As you can see on this diagram, in order to fulfill the contract of an ObjectRule, you will have to implement the following methods:

boolean canCheck( Object object );

Collection<ValidatorMessage> check( Object object );

The canCheck method defines which object type (i.e. class) a specific rule is able to validate. The second method, check, performs the validation and returns messages if inconsistencies are detected.
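To make this contract concrete, here is a minimal self-contained sketch. It does not use the validator library: SketchRule and SketchMessage are hypothetical stand-ins for ObjectRule and ValidatorMessage, and the SketchRunner loop, which calls canCheck before check, is only an assumption about how a validator would typically drive the two methods.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ValidatorMessage: it only carries a text.
class SketchMessage {
    final String text;
    SketchMessage(String text) { this.text = text; }
}

// Hypothetical stand-in mirroring the two methods of the ObjectRule contract.
interface SketchRule {
    boolean canCheck(Object object);
    Collection<SketchMessage> check(Object object);
}

// A toy rule: it only understands Strings and flags the blank ones.
class NonEmptyStringRule implements SketchRule {
    public boolean canCheck(Object object) {
        return object instanceof String;
    }

    public Collection<SketchMessage> check(Object object) {
        String s = (String) object;
        if (s.trim().isEmpty()) {
            return Collections.singletonList(new SketchMessage("Empty string found"));
        }
        return Collections.emptyList();
    }
}

// Assumed driver: apply each rule only to the objects it declares it can check.
class SketchRunner {
    static List<SketchMessage> run(List<SketchRule> rules, List<Object> objects) {
        List<SketchMessage> messages = new ArrayList<>();
        for (Object o : objects) {
            for (SketchRule rule : rules) {
                if (rule.canCheck(o)) {
                    messages.addAll(rule.check(o));
                }
            }
        }
        return messages;
    }
}
```

Because check casts its argument, it relies on canCheck having been consulted first; a rule that may be called directly could repeat the type test inside check.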

 

1. Writing a simple rule

 

So let's define a first, very simple rule that only accesses the data available in the provided instance of the data model. In this example we are still working with our Simple Proteomics Experiment, whose class diagram is available here.

In this first simple rule, we are going to look into the Experiment and report an error whenever no name has been given.
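As a rough illustration of such a rule, the sketch below checks the experiment name. It is self-contained and therefore does not extend the real ObjectRule class: Experiment (with its getName accessor), Level and Message are hypothetical stand-ins for the sample data model and the validator's message types, and the message text is made up.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical stand-in for the Experiment class of the sample data model.
class Experiment {
    private final String name;
    Experiment(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical stand-ins for the validator's message level and message types.
enum Level { INFO, ERROR }

class Message {
    final String text;
    final Level level;
    Message(String text, Level level) {
        this.text = text;
        this.level = level;
    }
}

// In the real validator this class would extend ObjectRule.
class ExperimentNameRule {

    // This rule only knows how to validate Experiment instances.
    public boolean canCheck(Object object) {
        return object instanceof Experiment;
    }

    // Returns one ERROR message when the name is null or blank, nothing otherwise.
    public Collection<Message> check(Object object) {
        Collection<Message> messages = new ArrayList<>();
        Experiment experiment = (Experiment) object;
        String name = experiment.getName();
        if (name == null || name.trim().isEmpty()) {
            messages.add(new Message("The experiment has no name.", Level.ERROR));
        }
        return messages;
    }
}
```

Returning an empty collection rather than null when nothing is wrong lets the caller merge the results of many rules without null checks.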

 

 

If you wish to run this rule yourself, you can download the source code of this sample validator here.

 

2. Writing a rule that uses ontologies

 

Now let's write a rule that reports the following inconsistencies:

  • If the molecule type is protein (SPE:0326) and a sequence is defined, the sequence has to be composed of amino acids only.
  • If the molecule type is nucleic acid or one of its child terms and a sequence is defined, the sequence has to be composed of nucleic acids only.
  • If the molecule type is ribonucleic acid or one of its child terms and a sequence is defined, the sequence has to be composed of ribonucleic acids only.
  • If the molecule has no sequence and is not a small molecule, we report a low-severity (INFO) message.

Here is the rule implementing these constraints:
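The constraints above can be approximated by the self-contained sketch below. It does not use the validator's ontology support: ToyOntology is a hypothetical in-memory child-to-parent map, Molecule, Severity and Finding are stand-ins for the real model and message classes, every term id except SPE:0326 is a placeholder, and the residue alphabets (20 standard amino acids, ACGTU, ACGU) are simplifying assumptions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a molecule of the sample data model.
class Molecule {
    final String typeId;    // ontology term id of the molecule type
    final String sequence;  // null when no sequence is defined
    Molecule(String typeId, String sequence) {
        this.typeId = typeId;
        this.sequence = sequence;
    }
}

enum Severity { INFO, ERROR }

class Finding {
    final String text;
    final Severity severity;
    Finding(String text, Severity severity) {
        this.text = text;
        this.severity = severity;
    }
}

// Toy ontology (child id -> parent id); assumes the hierarchy has no cycles.
class ToyOntology {
    private final Map<String, String> parentOf = new HashMap<>();

    void addChild(String child, String parent) { parentOf.put(child, parent); }

    // True if id is the given term itself or one of its descendants.
    boolean isSameOrChildOf(String id, String ancestor) {
        for (String cur = id; cur != null; cur = parentOf.get(cur)) {
            if (cur.equals(ancestor)) return true;
        }
        return false;
    }
}

// In the real validator this class would extend ObjectRule.
class MoleculeSequenceRule {
    static final String PROTEIN = "SPE:0326";                // id given in the text
    static final String NUCLEIC_ACID = "X:nucleic-acid";     // placeholder id
    static final String RNA = "X:rna";                       // placeholder id
    static final String SMALL_MOLECULE = "X:small-molecule"; // placeholder id

    private final ToyOntology ontology;
    MoleculeSequenceRule(ToyOntology ontology) { this.ontology = ontology; }

    public boolean canCheck(Object object) { return object instanceof Molecule; }

    public Collection<Finding> check(Object object) {
        Molecule m = (Molecule) object;
        Collection<Finding> findings = new ArrayList<>();
        // No sequence: INFO only, and not for small molecules.
        if (m.sequence == null || m.sequence.isEmpty()) {
            if (!ontology.isSameOrChildOf(m.typeId, SMALL_MOLECULE)) {
                findings.add(new Finding("Molecule has no sequence.", Severity.INFO));
            }
            return findings;
        }
        String seq = m.sequence.toUpperCase();
        if (PROTEIN.equals(m.typeId) && !seq.matches("[ACDEFGHIKLMNPQRSTVWY]+")) {
            findings.add(new Finding("Protein sequence contains non amino acid characters.", Severity.ERROR));
        }
        if (ontology.isSameOrChildOf(m.typeId, NUCLEIC_ACID) && !seq.matches("[ACGTU]+")) {
            findings.add(new Finding("Nucleic acid sequence contains invalid characters.", Severity.ERROR));
        }
        if (ontology.isSameOrChildOf(m.typeId, RNA) && !seq.matches("[ACGU]+")) {
            findings.add(new Finding("RNA sequence contains invalid characters.", Severity.ERROR));
        }
        return findings;
    }
}
```

Note that if ribonucleic acid is itself a child of nucleic acid in the ontology, as assumed in this sketch, an RNA molecule is tested against both the nucleic acid and the ribonucleic acid alphabets.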

 

 

Please note that, in order to keep this code sample concise, we have removed the import section. Please download the full source code if you want the complete version.

 

 

C. Configuring Your Set of Object Rules

 

1. The Object Rules Schema

 

2. Example of rule set for the two rules defined above

 

 


 
