Brief
This 2-phase competition is part of the NASA Tournament Lab and hosted by NCBI (The National Center for Biotechnology Information), NCATS (The National Center for Advancing Translational Sciences) and NIH (National Institutes of Health). These institutions, in collaboration with bitgrit and CrowdPlat, have come together to bring you this challenge where you can deploy your data-driven technology solutions towards accelerating scientific research in medicine and ensure that data from biomedical publications can be maximally leveraged and reach a wide range of biomedical researchers.
Each phase of the competition is designed to spur innovation in the field of natural language processing, asking competitors to design systems that can accurately recognize scientific concepts in the text of scientific articles, connect those concepts into knowledge assertions, and determine whether each assertion is a novel finding or background information.
Part 1: Given only an abstract text, the goal is to find all the nodes or biomedical entities (position in text and BioLink Model Category).
Part 2: Given the abstract and the nodes annotated from it, the goal is to find all the relationships between them (position in text and BioLink Model Predicate).
*NOTE: The prizes listed will be awarded based on a competitor’s combined, weighted scores from both phases of the competition. Please see the Rules section for more information.*
The National Center for Advancing Translational Sciences (NCATS, a center of the National Institutes of Health):
NCATS is conducting this challenge under the America Creating Opportunities to Meaningfully Promote Excellence in Technology, Education, and Science (COMPETES) Reauthorization Act of 2010. This challenge will spur innovation in NLP to advance the field and allow the generation of more accurate and useful data from biomedical publications, which will enhance the ability of data scientists to create tools that foster discovery and generate new hypotheses.
The National Center for Biotechnology Information (NCBI, part of the National Library of Medicine, a division of the National Institutes of Health):
NCBI intramural researchers and their collaborators have provided a corpus of annotated abstracts from published scientific research articles and knowledge assertions between these concepts, which will be provided to participants for training and testing purposes.
CrowdPlat (Project Company):
The LitCoin project was awarded to and is being managed by CrowdPlat under NASA's NOIS2 contract. Located in San Jose, California, CrowdPlat provides crowdsourcing solutions to medium and large enterprises seeking project execution through a crowdsourced talent pool.
Prizes
1st Prize ($35,000)
2nd Prize ($25,000)
3rd Prize ($20,000)
4th Prize ($5,000)
5th Prize ($5,000)
6th Prize ($5,000)
7th Prize ($5,000)
The prize money displayed is the total prize for both phases of the LitCoin NLP Challenge. Please see the Rules section for more info!
Timeline
- 09 Nov 2021 Competition Phase-1 Starts
- 23 Dec 2021 Competition Phase-1 Ends
- 28 Dec 2021 Competition Phase-2 Starts
Data Breakdown
The goal of the first part of the LitCoin Challenge is to identify the location and type of biomedical concepts (entities) within a research paper’s title and abstract.
The location of each biomedical entity is determined by two “offset” numbers, “offset_start” and “offset_finish”, that indicate the index or position where the concept substring starts and the index or position where the concept substring ends, respectively. The string considered for these indexes is the concatenation of the title + abstract strings.
The number of entities in each abstract is not fixed.
The type of the biomedical entities comes from the BioLink Model Categories, and can be any of the following:
ă»DiseaseOrPhenotypicFeature
ă»ChemicalEntity
ă»OrganismTaxon
ă»GeneOrGeneProduct
ă»SequenceVariant
ă»CellLine
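As a rough illustration of how the offsets and categories fit together, the sketch below slices an entity out of the title + abstract string. Two things here are assumptions not stated in the description: that a single space separates title and abstract, and that offset_finish is exclusive (like the end of a Python slice). The names `Entity`, `build_text` and `entity_text`, and the example sentence, are made up for illustration; check a few rows of the training data before relying on either assumption.

```python
from dataclasses import dataclass

# The six BioLink Model Categories listed above.
BIOLINK_CATEGORIES = {
    "DiseaseOrPhenotypicFeature",
    "ChemicalEntity",
    "OrganismTaxon",
    "GeneOrGeneProduct",
    "SequenceVariant",
    "CellLine",
}


@dataclass(frozen=True)
class Entity:
    """Hypothetical container for one annotated entity."""
    offset_start: int
    offset_finish: int
    type: str


def build_text(title: str, abstract: str, separator: str = " ") -> str:
    # ASSUMPTION: title and abstract are joined by a single space.
    return title + separator + abstract


def entity_text(text: str, entity: Entity) -> str:
    # ASSUMPTION: offset_finish is exclusive, like a Python slice end.
    if entity.type not in BIOLINK_CATEGORIES:
        raise ValueError(f"unknown category: {entity.type}")
    if not 0 <= entity.offset_start < entity.offset_finish <= len(text):
        raise ValueError("offsets fall outside the text")
    return text[entity.offset_start:entity.offset_finish]


# Invented example text, not taken from the competition data.
text = build_text("Aspirin and asthma.", "Aspirin can trigger asthma in some patients.")
print(entity_text(text, Entity(0, 7, "ChemicalEntity")))
print(entity_text(text, Entity(12, 18, "DiseaseOrPhenotypicFeature")))
```

If the printed substrings for real training rows look shifted by one character, one of the two assumptions above is probably wrong for this dataset.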
To properly understand these concepts it would be helpful to get familiar with the concept of biomedical ontologies:
[LINK]
There are the following files included in the dataset:
ă»abstracts_train.csv:
# abstract_id: ID of the research paper
# title: title of the research paper
# abstract: abstract of the research paper
ă»entities_train.csv: CSV file containing all the entities (as a pair of offset and a type) found in the abstracts that can be used for training.
# id:
# abstract_id:
# offset_start:
# offset_finish:
# type:
# entity_id:
ă»relations_train.csv: CSV file containing all the relations () found in the abstracts that can be used for training FOR THE SECOND PART OF THE COMPETITION.
# id:
# entity_id_1:
# entity_id_2:
# relation_type:
# novel:
ă»abstracts_test.csv:
# abstract_id: ID of the research paper
# title: title of the research paper
# abstract: abstract of the research paper
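The column layout above suggests grouping entity rows by abstract_id before training or scoring. The following is a minimal sketch using only the standard library. It assumes the files have a header row with the column names listed above and that the delimiter is a comma (the actual files may use a different delimiter, so it is exposed as a parameter). The function name `group_entities` and the in-memory sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_entities(entities_file, delimiter=","):
    """Group (offset_start, offset_finish, type) tuples by abstract_id."""
    # ASSUMPTION: header row present; comma delimiter unless told otherwise.
    reader = csv.DictReader(entities_file, delimiter=delimiter)
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["abstract_id"]].append(
            (int(row["offset_start"]), int(row["offset_finish"]), row["type"])
        )
    return dict(grouped)


# Invented rows standing in for entities_train.csv.
sample = io.StringIO(
    "id,abstract_id,offset_start,offset_finish,type,entity_id\n"
    "0,101,0,7,ChemicalEntity,X1\n"
    "1,101,12,18,DiseaseOrPhenotypicFeature,X2\n"
    "2,102,5,9,GeneOrGeneProduct,X3\n"
)
entities_by_abstract = group_entities(sample)
print(entities_by_abstract["101"])
```

With the real file, the same function could be called as `group_entities(open("entities_train.csv", newline="", encoding="utf-8"))`, adjusting `delimiter` if needed.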
The evaluation metric for this problem is a modified version of the Jaccard Similarity Score:
ă»For an abstract_id A, a set of predicted concepts P and a set of actual original concepts O, the formula is: |P| + |O| - |PâO| / |PâO|, where â means intersection, â means union, || means length or amount of concepts and where matching concepts in the intersection is determined by having the same (or very similar) offsets and the same type.
The Jaccard Similarity Scores for each abstract are then averaged to return the final score.
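For local validation, a simplified version of this metric can be sketched as intersection over union per abstract, averaged across abstracts. The sketch below is not the organizers' scorer: it requires exact offset matches (the "very similar offsets" tolerance is not specified here), it scores an abstract with no predicted and no actual entities as 1.0, and it treats a missing prediction as an empty set. All three choices, and the function names, are assumptions.

```python
def jaccard(predicted, actual):
    """Intersection over union of two collections of (start, finish, type)."""
    p, o = set(predicted), set(actual)
    if not p and not o:
        return 1.0  # ASSUMPTION: two empty sets count as a perfect match
    return len(p & o) / len(p | o)


def mean_jaccard(predictions, gold):
    """Average the per-abstract scores over every abstract_id in gold."""
    scores = [
        jaccard(predictions.get(abstract_id, []), entities)
        for abstract_id, entities in gold.items()
    ]
    return sum(scores) / len(scores)


# Invented example: abstract "A" has 1 match out of 3 distinct entities,
# abstract "B" is predicted perfectly.
gold = {
    "A": [(0, 7, "ChemicalEntity"), (12, 18, "DiseaseOrPhenotypicFeature")],
    "B": [(5, 9, "GeneOrGeneProduct")],
}
predictions = {
    "A": [(0, 7, "ChemicalEntity"), (20, 25, "CellLine")],
    "B": [(5, 9, "GeneOrGeneProduct")],
}
print(mean_jaccard(predictions, gold))
```

Because matching here is exact, this local score is likely a lower bound on what a tolerance-based scorer would report for the same predictions.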
Final competition results are based on competitors’ combined, weighted scores from both phases of the competition. Winners will be determined by a weighted average of scores from the two competition phases: 30% of the total score will be determined by problem statement 1 and 70% of the total score will be determined by problem statement 2.
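The 30/70 weighting can be written out as a short calculation. This sketch assumes both phase scores are on the same 0 to 1 scale, which the description does not state explicitly; the function name and the example scores are hypothetical.

```python
def combined_score(phase1: float, phase2: float,
                   weight1: float = 0.3, weight2: float = 0.7) -> float:
    """Weighted average of the two phase scores (30% phase 1, 70% phase 2)."""
    # ASSUMPTION: both scores share the same scale, e.g. 0.0 to 1.0.
    return weight1 * phase1 + weight2 * phase2


# Invented scores: 0.3 * 0.8 + 0.7 * 0.6
print(combined_score(0.8, 0.6))
```

Under this weighting, a gain of a given size in phase 2 moves the final score a little over twice as much as the same gain in phase 1.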
*NOTE: WHEN YOU UPLOAD YOUR SUBMISSION, AN ERROR SAYING "INVALID FILE" MIGHT APPEAR. PLEASE IGNORE IT AND CONFIRM WHETHER YOUR SUBMISSION WAS SCORED.*
FAQs
Who do I contact if I need help regarding a competition?
If you have any inquiries about participating in this competition, please don’t hesitate to reach out to us at [email protected]. For questions about eligibility or prize distribution, email NCATS at [email protected].
How will I know if I’ve won?
If you are one of the top seven winners for this competition, we will email you with the final result and information about how to claim your reward.
How can I report a bug?
Please shoot us an email at [email protected] with details and a description of the bug you are facing, and if possible, please attach a screenshot of the bug itself.
If I win, how can I receive my reward?
The money prize will be awarded by NIH/NCATS directly to the winner (if an individual) or Team Lead of the winning team (if a team). Please check rule number 7 for eligibility information. Prizes awarded under this Challenge will be paid by electronic funds transfer and may be subject to Federal income taxes. HHS/NIH will comply with the Internal Revenue Service withholding and reporting requirements, where applicable.
Rules
1. This competition is governed by the following Terms of Participation (“Participation Rules”). Participants must agree to and comply with the Participation Rules to compete.
2. This competition consists of 2 problem statements, herein considered as competition sub-phases. Winners will be determined by a weighted average of scores from the two competition phases: 30% of the total score will be determined by problem statement 1 and 70% of the total score will be determined by problem statement 2.
3. The competition dates are detailed below:
Phase 1 Start Date: November 9th, 2021
Phase 1 Closing Date: December 23rd, 2021
Phase 2 Start Date: December 28th, 2021
Phase 2 Closing Date: February 28th, 2022
Submission (Final Source Code): March 11th, 2022
Winners Announced: April 8th, 2022
4. Participants are allowed to participate in an individual capacity or as part of a team.
5. Merging teams midway through the competition is not allowed.
6. Each participant may only be a member of a single team and may not participate as individuals and on a team simultaneously.
7. In order to participate in this competition and be eligible for the prize money, participants must be a U.S. citizen or a U.S. permanent resident. Non-U.S. citizens and non-permanent residents can participate as well, as a member of a team that includes a citizen or permanent resident of the U.S, or they can participate on their own. However, such non-U.S. citizens and non-permanent residents are not eligible to win a monetary prize (in whole or in part). Their participation as part of a winning team, if applicable, may be recognized when the results are announced. Similarly, if participating on their own, they may be eligible to win a non-cash recognition prize. Proof of citizenship and permanent residency will be required. For more information on competition eligibility requirements, please see https://ncats.nih.gov/challenges/litcoin
8. In the case of a team participation, all submissions must be made by the team lead.
9. The use of external datasets for the purposes of training is allowed, but submissions must be generated using the test corpus provided.
10. During the competition period, participants may make a maximum of 5 submissions per day. Once the daily limit is reached, the platform resets the following day to allow another 5 submissions. Please keep this in mind when uploading a submission file. Any attempt to circumvent stated limits will result in disqualification.
11. Participants are not permitted to share or upload the competition dataset to any platform outside of competition. Participants that do not comply with the confidentiality regulations of the competition will be disqualified.
12. The top seven (7) winning participants will be eligible to receive a competition prize (ranked by performance) after we have received, successfully executed, and confirmed the validity of both the code and the solution (See 14.). In order to ensure that at least 7 participants may be awarded prizes, the top fifteen (15) individuals/teams will be asked to submit their source code for evaluation (see 13.).
13. Once potential competition winners are determined and our team reaches out to them, the top scoring participants must provide the following by March 11, 2022 for evaluation to be qualified as competition winner(s) and receive their prize:
a. Winning Model Documentation template filled in (this document is available on the “Resources” tab on the competition page)
b. All source files required to preprocess the data
c. All source files required to build, train and make predictions with the model using the processed data
d. A requirements.txt (or equivalent) file indicating all the required libraries and their versions as needed
e. A ReadMe file containing the following:
• Clear and unambiguous instructions on how to reproduce the predictions from start to finish, including data pre-processing, feature extraction, model training and predictions generation
• Environment details regarding where the model was developed and trained, including OS, memory (RAM), disk space, CPU/GPU used, and any environment configurations required to execute the code
• Clear answers to the following questions:
- Which data files are being used?
- How are these files processed?
- What is the algorithm used and what are its main hyperparameters?
- Any other comments considered relevant to understanding and using the model
14. Solution submissions should be able to generate the exact output that gives the corresponding score on the leaderboard. If the score obtained from the code is different from what’s shown on the leaderboard, the new score (which may be lower) will be used for the final rankings unless a logical explanation is provided. Please make sure to set the seed or random_state etc. so we can obtain the same result from your code.
15. Solution submissions will also be used to generate output based on a validation dataset, generated in the same manner with which the provided test and training sets were generated, which will be kept hidden from all participants, in order to verify that code was not customized for the provided dataset. This output will not be used to determine leaderboard position, but could be used to disqualify a participant from receiving a prize if the output is judged to be severely inaccurate by bitgrit, CrowdPlat and NCATS.
16. In order to be eligible for the prize, a competition winner (whether an individual, group of individuals, or entity) must agree to grant to the NIH an irrevocable, paid-up, royalty-free non-exclusive worldwide license to reproduce, publish, post, link to, share, and display publicly the submission on the web or elsewhere, and a non-exclusive, non-transferable, irrevocable, paid-up license to practice or have practiced for or on its behalf, the solution throughout the world. For more detailed information, please visit http://ncats.nih.gov/challenges/litcoin.
17. Any prize awards are subject to verification of eligibility and compliance with these Participation Rules. Novelty and innovation of submissions may also affect the final ranking. All decisions of bitgrit, CrowdPlat and NCATS will be final and binding on all matters relating to this Competition.
18. Cash prizes will be paid directly by NIH/NCATS to the competition winners. In the case of a winning team, the money prize will be paid directly by NIH/NCATS to the Team Lead. Non-U.S. citizens and non-permanent residents are not eligible to receive a cash prize (in whole or in part). Their participation as part of a winning team, if applicable, may be recognized when the results are announced.
Prizes awarded under this Challenge will be paid by electronic funds transfer and may be subject to local, state, federal and foreign tax reporting and withholding requirements. HHS/NIH will comply with the Internal Revenue Service withholding and reporting requirements, where applicable.
19. If two or more participants have the same score on the leaderboard, an earlier submission will take precedence and be ranked higher than a later submission.
20. If you have any inquiries about participating in this competition, please don’t hesitate to reach out to us at [email protected]. For questions about eligibility or prize distribution, email NCATS at [email protected].
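Rule 14 asks for a fixed seed so that the submitted code reproduces the leaderboard output. A minimal sketch of that idea with Python's standard library follows. The helper name `set_seed` and the seed value are hypothetical, and this only covers the built-in `random` module; any other library in the pipeline (for example NumPy or a deep learning framework) has its own seeding call that would need to be added.

```python
import random


def set_seed(seed: int = 42) -> None:
    """Seed Python's built-in random number generator."""
    # Other libraries keep separate generators and must be seeded separately.
    random.seed(seed)


# Re-seeding with the same value reproduces the same sequence of draws.
set_seed(123)
first_run = [random.random() for _ in range(3)]
set_seed(123)
second_run = [random.random() for _ in range(3)]
print(first_run == second_run)
```

Calling such a helper once at the top of the training script, and recording the seed in the ReadMe, is one simple way to make a rerun comparable to the original submission.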
Terms of Participation
Agreement regarding confidential information and competition rules
These Terms of Participation (“Agreement”) are hereby entered into on the date of your participation conditional upon your agreement to these terms (“Effective Date”) between you (“Participant”), as a participant in the LitCoin NLP Challenge: Part 1 competition (the “Competition”) hosted at bitgrit.net (the “Competition Site”), and bitgrit Inc. (“bitgrit”).
IMPORTANT, READ CAREFULLY: Your participation in the Competition on the above Competition Site is conditional upon your comprehension of, compliance with, and acceptance of these terms. Please review thoroughly before accepting.
I. General Clauses
1. This competition consists of 2 problem statements, herein considered as competition sub-phases. Winners will be determined by a weighted average of scores from the two competition phases: 30% of the total score will be determined by problem statement 1 and 70% of the total score will be determined by problem statement 2.
2. Participants are allowed to participate in an individual capacity or as part of a team.
3. Merging teams midway through the competition is not allowed.
4. Each participant may only be a member of a single team and may not participate as individuals and on a team simultaneously.
5. In order to participate in this competition and be eligible for the prize money, participants must be a U.S. citizen or a U.S. permanent resident. Non-U.S. citizens and non-permanent residents can participate as well, as a member of a team that includes a citizen or permanent resident of the U.S, or they can participate on their own. However, such non-U.S. citizens and non-permanent residents are not eligible to win a monetary prize (in whole or in part). Their participation as part of a winning team, if applicable, may be recognized when the results are announced. Similarly, if participating on their own, they may be eligible to win a non-cash recognition prize. Proof of citizenship and permanent residency will be required. For more information on competition eligibility requirements, please see https://ncats.nih.gov/challenges/litcoin
6. In the case of a team participation, all submissions must be made by the team lead.
7. Participants are not permitted to share or upload the competition dataset to any platform outside of competition. Participants that do not comply with the confidentiality regulations of the competition will be disqualified.
8. The top seven (7) winning participants will be eligible to receive a competition prize (ranked by performance) after we have received, successfully executed, and confirmed the validity of both the code and the solution. In order to ensure that at least 7 participants may be awarded prizes, the top fifteen (15) individuals/teams will be asked to submit their source code for evaluation.
9. Any prize awards are subject to verification of eligibility and compliance with these Terms of Participation. Novelty and innovation of submissions may also affect the final ranking. All decisions of bitgrit, CrowdPlat and NCATS will be final and binding on all matters relating to this Competition.
10. Cash prizes will be paid directly by NIH/NCATS to the competition winners. In the case of a winning team, the money prize will be paid directly by NIH/NCATS to the Team Lead. Non-U.S. citizens and non-permanent residents are not eligible to receive a cash prize (in whole or in part). Their participation as part of a winning team, if applicable, may be recognized when the results are announced.
Payments to winners may be subject to local, state, federal and foreign tax reporting and withholding requirements.
II. Clauses of Non-Disclosure
1. Confidential Information
(1) Confidential Information shall mean any and all information disclosed by bitgrit to the Participant with regard to the entry and participation in the Competition, including (i) metadata, source code, object code, firmware etc. and, in addition to these, (ii) analyses, compilations or any other deliverable produced by the Participant in which such disclosed information is utilized or reflected.
(2) Confidential Information shall not include information which:
(a) is now, or hereafter becomes, generally known or available to the public, or enters the public domain, through no act or omission of the Participant;
(b) is acquired by the Participant before receiving such information from bitgrit and such acquisition was without restriction as to the use or disclosure of the same;
(c) is hereafter rightfully furnished to the Participant by a third party, without restriction as to use or disclosure of the same.
2. Non-Disclosure Obligation
The Participant agrees:
(a) to hold Confidential Information in strict confidence;
(b) to exercise at least the same care in protecting Confidential Information from disclosure as the party uses with regard to its own confidential information;
(c) not to use any Confidential Information except for as it concerns the Purpose elaborated upon above;
(d) not to disclose such Confidential Information to third parties;
(e) to inform bitgrit if it becomes aware of an unauthorized disclosure of Confidential Information.
3. No Warranty
All Confidential Information is provided “as is.” None of the Confidential Information shall contain any representation, warranty, assurance, or integrity by bitgrit to the Participant of any kind.
4. No Assignment of Rights
The Participant agrees that nothing contained in this Agreement shall be construed as conferring, transferring or granting any rights to the Participant, by license or otherwise, to use any of the Confidential Information.
III. Rights to Deliverables
1. Transferable rights / Licenses
In order to be eligible for the prize, a competition winner (whether an individual, group of individuals, or entity) must agree to grant to the NIH an irrevocable, paid-up, royalty-free non-exclusive worldwide license to reproduce, publish, post, link to, share, and display publicly the submission on the web or elsewhere, and a non-transferable, irrevocable, paid-up, royalty-free non-exclusive worldwide license to practice or have practiced for or on its behalf, the solution. For more detailed information, please visit http://ncats.nih.gov/challenges/litcoin.
2. Restrictions on Use
The Participant hereby agrees to not utilize Submitted Algorithms to or for businesses, business endeavors, products, or services in competition with bitgrit or with the Competition co-host.
3. Authorization of Non-compensatory Use
The Participant hereby authorizes and consents to bitgrit and/or relevant third parties utilizing, analyzing, altering, or further reauthorizing the use of the Submitted Algorithm(s) to other third parties and will not make claims or demands for monetary compensation in regard to the above purposes.
4. Representations and Warranties
The Participant hereby declares and warrants that the Participant’s, bitgrit’s, and the related third party’s use of the Submitted Algorithms does not violate or infringe upon the intellectual property rights, business secrets, or other rights of any other third party.
5. Warranty Against Exercising of Moral Rights
The Participant agrees to not exercise moral rights to bitgrit or to related third parties in regard to the Submitted Algorithms.
6. Rights Regarding Modified and Derivative Works
The Participant hereby agrees that Intellectual Property Rights and other rights regarding any modified or derivative works created from the Submitted Algorithms shall belong to the creator of that modified or derivative work.