General Rules

These are some of the research (curation) rules that are applicable to each file.


For each row in the spreadsheet, please perform the following checks. If the assertion passes every check, it is considered correct; if any one check fails, the assertion fails.


1 = correct

0 = incorrect

Please assign 1 or 0 for each row into column K.

One row is done for each file as an example.

See the curation rules for each file below.

Please expand the Concepts to capture the full concept if necessary, but do not change the columns containing the original TERMS.

The Concept needs to match the 'concept type' (e.g. if SDS-PAGE is the concept, then the concept type should be Biomedical Technique).

Please change the relationship if you do not think it is appropriate. However, please use a relationship which already exists elsewhere in this file if possible.

Each cell can only contain one concept. For example, take "interleukin-1 (IL-1), and -6 (IL-6) IS ASSAYED USING Bioassay": there is no single protein called "interleukin-1 (IL-1), and -6 (IL-6)".

Therefore, this would need to be curated either as "IL-6 IS ASSAYED USING Bioassay" or the curator should insert extra rows, to add in the extra proteins.

Similarly, the concepts must match the TYPE given, so that if Type 1 is "Protein", Concept 1 must be a single protein name, not a protein or mRNA level, etc. For example, in the file Protein_A_G_2_Protein.xls (Protein File in the test):

TYPE 1     CONCEPT 1                              RELATIONSHIP TYPE    CONCEPT 2    TYPE 2

Protein    The binding of RF-IgM to animal IgG    IS INHIBITED BY      Protein A    Biomedical Technique

and the phrase "The binding of RF-IgM to animal IgG" is not of Type "Protein", so this is wrong. The sentence reads "The binding of RF-IgM to animal IgG was inhibited by addition of protein A, which binds some animal IgG by recognizing the junctional site on CH2-CH3 domains in the Fc region", so for this, the curator could have:

 * scored the assertion "IgG IS INHIBITED BY Protein A" as 0

 * curated the assertion to read "IgG IS AFFECTED BY Protein A"

 * or taken an assertion from the same sentence to get "Protein A BINDS IgG"

Different parts of the Excel spreadsheet test have different rules. Each part is named and described below:


VP_builder File:


Please expand Clinical Technologies to more specific concepts if required.

Thus, if the identified term is CT, but the technology used is clearly a more specific form of CT (e.g. "Cine-CT"), then please replace the normalized form of CT (Computerized Tomography) with "Cine-CT". However, do not expand to include ligands/contrast agents: thus "FDG-PET" (i.e. PET using fluorodeoxyglucose as a contrast agent) should remain "PET" (and the normalized form should remain "Positron Emission Tomography").

Oncological Disorders

Also, please expand Oncological Disorders to more specific concepts where appropriate. Thus, if the term "tumour" has been identified within a sentence containing "tumour of the left lung", then the Concept would most sensibly be expanded to "tumour of the lung" (but don't bother about which lung). Also, don't bother with terms such as "advanced", "malignant", "acute", "chronic", etc. unless the sheet has already identified them as existing terms. Many abstracts use "Tumour", "Cyst", etc. as shorthand for very specific forms of tumour or cyst, which are referred to elsewhere in the abstract (often in the title). In these cases, please extract these full and specific forms, and insert them in place of "Oncological Disorder" or "Cyst".

Species: We are looking into Human only.


BW0603 file

This is a file of candidate assertions of the type "Oncological Disorder - to - Process", and we're looking for data like "Lung cancer - leads to/causes/etc. - Oedema".

Species: We are looking into Human only.

Oxidation File

This file contains the following connections around Protein Oxidation:

Protein Oxidation -> Protein

Protein Oxidation -> Compound

Protein Oxidation -> Formulation type, e.g. liquid, solid, lyophilized

Protein Oxidation -> Physicochemical Property, e.g. acidic pH, turbidity, viscosity

Protein Oxidation -> Biomedical technique, i.e. assay

The goal is to get as much context around protein oxidation as we can: the proteins that it affects, how it is measured, compounds that induce or inhibit the process, whether it occurs in liquids, solids, etc. Wherever possible, we are also interested in saying which residue is affected by the oxidation process. For example, you'll see that I've created assertions for Peptidyl-Methionine Oxidation and Peptidyl-Cysteine Oxidation. If the article refers to oxidation of a specific amino acid, create this as the new concept in the format Peptidyl-<name> Oxidation. Generally, you will find that oxidation affects Met, Cys, Tyr, Trp and His, but if others are referred to, this will be interesting. You have to be careful here that the oxidation is affecting a peptidyl amino acid, and not just an amino acid in free solution.
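As a rough illustration of the Peptidyl-<name> Oxidation naming format, here is a minimal Python sketch. The function name and the abbreviation table are assumptions made for this example only; they are not part of the curation tooling, and the table covers just the five residues listed above.

```python
# Hypothetical helper: builds a concept name in the "Peptidyl-<name> Oxidation" format.
# The abbreviation table covers only the residues named in the guidelines.
RESIDUE_NAMES = {
    "Met": "Methionine",
    "Cys": "Cysteine",
    "Tyr": "Tyrosine",
    "Trp": "Tryptophan",
    "His": "Histidine",
}

def peptidyl_oxidation_concept(residue: str) -> str:
    """Return the concept name for oxidation of a peptidyl amino acid."""
    cleaned = residue.strip().capitalize()
    # Expand a known three-letter code; otherwise use the name as given.
    full_name = RESIDUE_NAMES.get(cleaned, cleaned)
    return f"Peptidyl-{full_name} Oxidation"
```

The helper only formats the name. Deciding whether the article really describes a peptidyl residue, rather than a free amino acid in solution, remains a judgement for the curator.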

Some of the assertions will have relationships, whereas others will not. As well as correcting any relationships as appropriate, we need to create new relationships for those that do not have one (although only for those that are good assertions), such as: AFFECTS, IS AFFECTED BY, IS OBSERVED IN, IS OBSERVED BY, IS ASSAYED BY, INCREASES, REDUCES, CAUSES, CHANGES, etc.

You'll find that there's quite a bit of redundancy in the file, where the same assertion has been identified from different sentences in the same article. If you've already created an assertion, e.g. Protein Oxidation - AFFECTS - Albumin, then just mark any analogous assertions from the same article with a zero. This way, you'll find that you get through the file much quicker.
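The redundancy rule can be sketched in Python as follows. This is only an illustration: the row keys (`article`, `concept_1`, `relationship`, `concept_2`) are hypothetical names, not the actual spreadsheet columns, and the sketch catches only exact repeats (ignoring case), whereas "analogous" assertions may still need a curator's judgement.

```python
# Hypothetical sketch: flag assertions that repeat an earlier one from the same article.
# Row keys are assumed names for illustration, not the real spreadsheet headers.
def flag_repeats(rows):
    """Return a list of booleans; True means the row repeats an earlier assertion."""
    seen = set()
    flags = []
    for row in rows:
        key = (
            row["article"],
            row["concept_1"].strip().lower(),
            row["relationship"].strip().lower(),
            row["concept_2"].strip().lower(),
        )
        flags.append(key in seen)  # True only if this exact assertion was seen before
        seen.add(key)
    return flags
```

Rows flagged True are candidates for a score of 0, provided the first occurrence from that article has already been curated as a good assertion.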

Some of the concepts are named 'Oxidation Gap' - these derive from gap-filling terms supplied to us by the customer. If the assertion is a good one, please change this concept to something more appropriate, such as Protein Oxidation, or one of the Peptidyl-amino acid Oxidations that I talked about earlier.

A point about pH values: if the article provides us with a numeric pH value, we are using the following conversion:

pH value           Concept Name

0-3                    Acidic pH

3-6                    Moderate pH

6-8                    Neutral pH

8-10                  Moderate to Basic pH

10-14                Basic pH

Otherwise just quote the value as stated in the article, e.g. high/low pH, elevated/decreased pH, etc.
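The conversion table can be expressed as a small Python sketch. Note that the ranges in the table share their end points (3, 6, 8 and 10), and the guidelines do not say which band a boundary value belongs to. The sketch assumes, purely for illustration, that each boundary value falls into the higher band; the function name is also hypothetical.

```python
# Hypothetical sketch of the pH conversion table.
# Assumption: boundary values (3, 6, 8, 10) are assigned to the higher band,
# because the table itself does not specify this.
def ph_concept(ph: float) -> str:
    """Map a numeric pH value (0-14) to the concept name from the table."""
    if not 0 <= ph <= 14:
        raise ValueError(f"pH value out of range: {ph}")
    if ph < 3:
        return "Acidic pH"
    if ph < 6:
        return "Moderate pH"
    if ph < 8:
        return "Neutral pH"
    if ph < 10:
        return "Moderate to Basic pH"
    return "Basic pH"
```

For example, under this assumption a reported pH of 7.4 maps to "Neutral pH" and a reported pH of 3 maps to "Moderate pH".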

Species: All

Aggregation File

This file is along the same lines as the Oxidation file, but this time we are looking at protein aggregation. In addition to aggregation, it also contains putative assertions for dimerisation, oligomerisation and hydrophobic interaction. As with the oxidation file, it contains an "Aggregation Gap" concept, which we will need to change to something more appropriate, e.g. Protein Aggregation, Protein Dimerisation, etc.

All of the same rules apply that were used in the oxidation corpus: check that it's referring to protein aggregation, insert a valid relationship for any good assertions that don't have one, etc. Again, feel free to skip any assertions that you feel aren't very high quality, e.g. the word "particle" under "term 1" might refer more often to chromatography matrices than to the formation of protein particles in solution.

Species: All 

The type of work is scientific research, so we invite all promising candidates to try to answer a few questions, of the kind they will encounter in everyday work, in an online test. We would like to invite you, too, to answer our questions.

The link to the test is here:

The purpose of the test is to establish your understanding of English-language literature and your problem-solving ability. The test is not a test of knowledge, so you may use any resources you wish to find the information you need, provided they would be available to you in everyday work. The test time starts running as soon as the web link is opened. When you have finished the test, please press 'submit query'. The information is saved only once 'submit query' has been pressed. During the test, please do not press back or forward in the test's browser window, because the information will be lost.