Data Submission
Data submission in OmicsDM is a three-step process:
- Request project(s) creation: Project creation is done by the OmicsDM admin
- Create dataset(s): Here you create the "folder" for your files + attach the pheno-clinical data
- Upload file(s): Here you can upload your molecular data files such as RNA-seq count matrices
Prerequisite: Request project(s)
Write an email to CNAG's 3TR helpdesk asking for the creation of a new project.
Either provide the following information in the email:
Information | Explanation |
---|---|
Project ID | A unique identifier for the project |
Name | The name of the project |
Owners | The user groups that should be able to create datasets in the project |
Description | A short description of the project |
Diseases | The diseases that should be selectable in the dropdown during dataset creation |
Logo Url | The URL to the logo of the project |
Dataset Visibility default | The default visibility of the datasets in the project |
Dataset Visibility Changeable | Whether the visibility of the datasets can be changed |
File Download Allowed | Whether the uploaded files can be downloaded by users not in the owners group(s) |
or download the project_template, fill it out, and attach it to the email.
Why use the project creation template?
- Handy when you want to ask for the creation of multiple projects
- Makes life easier for the admin creating the project(s)
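When several projects are requested at once, it can help to collect the requested fields in a script and check them for gaps before writing the email. The following is a minimal sketch only: the column names simply mirror the "Information" column of the table above, every field is assumed to be required, and the real project_template may use different headers or a different file format.

```python
import csv
import io

# Column names copied from the table above. The actual project_template
# may be laid out differently; treat this list as an assumption.
FIELDS = [
    "Project ID", "Name", "Owners", "Description", "Diseases", "Logo Url",
    "Dataset Visibility default", "Dataset Visibility Changeable",
    "File Download Allowed",
]


def missing_fields(request):
    """Return the fields that are absent or blank in one project request."""
    return [f for f in FIELDS if not str(request.get(f, "")).strip()]


def requests_to_csv(requests):
    """Serialise several project requests into one CSV string."""
    for request in requests:
        gaps = missing_fields(request)
        if gaps:
            project = request.get("Project ID", "?")
            raise ValueError(f"{project}: missing {gaps}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(requests)
    return buffer.getvalue()
```

Calling `requests_to_csv` with a list of dictionaries (one per project) returns text that can be saved and attached to the email; a request with a blank field raises a `ValueError` naming the project and the missing fields.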
Create Datasets and Attach Pheno-Clinical Data
Recommended nomenclature for the Pheno-Clinical data file
These guidelines help maintain a consistent, clear, and scalable file naming convention for clinical data files, including REDCap exports, QC’d data, and converted data formats.
General Format
[Project/StudyID]_[Source]_[FileType]_[Version/Date]_QC[QCStatus]_[QCBy]_CONV[ConversionType].ext
Components
- [Project/StudyID]: Unique identifier for the project or study (e.g., Study123, ProjectX).
- [Source]: Origin of the file:
  - REDCapRaw: Raw export from REDCap.
  - REDCapLabel: Labeled CSV export from REDCap (dictionary embedded).
  - REDCapDict: The dictionary CSV export from REDCap.
- [FileType]: Type of file:
  - CSV: For CSV files.
  - JSON: For JSON files (BFF or PXF).
  - DICT: For dictionary files from REDCap.
- [Version/Date]: Version number or a date string in YYYYMMDD format (e.g., v1, 20231204).
- [QCStatus]: Quality control status:
  - Yes: If QC has been performed.
  - No: If QC has not been performed.
- [QCBy]: Identifier for the center performing QC (e.g., CNAG, KI, FPS). If no QC has been performed (QCNo), omit this field.
- [ConversionType]: Converted format (if applicable). Omit this if the file is not converted.
  - BFF: Converted to Beacon v2 JSON format.
  - PXF: Converted to Phenopackets JSON format.
- .ext: File extension:
  - .csv: For CSV files.
  - .json: For JSON files.
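The components can be assembled programmatically so that names stay consistent across exports. The sketch below is an illustration, not part of OmicsDM: the function name `build_filename` is hypothetical, and the literal `QC` and `CONV` prefixes as well as the use of `.csv` for dictionary files are inferred from the example file names in this section.

```python
from typing import Optional

SOURCES = {"REDCapRaw", "REDCapLabel", "REDCapDict"}
FILE_TYPES = {"CSV", "JSON", "DICT"}
CONVERSIONS = {"BFF", "PXF"}


def build_filename(study_id: str, source: str, file_type: str, version: str,
                   qc_by: Optional[str] = None,
                   conversion: Optional[str] = None) -> str:
    """Assemble a pheno-clinical file name from its components."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source}")
    if file_type not in FILE_TYPES:
        raise ValueError(f"unknown file type: {file_type}")
    if conversion is not None and conversion not in CONVERSIONS:
        raise ValueError(f"unknown conversion type: {conversion}")

    parts = [study_id, source, file_type, version]
    if qc_by:
        # QC performed: record the status and the center that did it.
        parts += ["QCYes", qc_by]
    else:
        # No QC: the QCBy field is omitted entirely.
        parts.append("QCNo")
    if conversion:
        parts.append("CONV" + conversion)

    # JSON files get .json; CSV and dictionary files get .csv (assumption
    # based on the example names in this section).
    extension = ".json" if file_type == "JSON" else ".csv"
    return "_".join(parts) + extension
```

For instance, `build_filename("Study123", "REDCapRaw", "CSV", "20231204")` yields `Study123_REDCapRaw_CSV_20231204_QCNo.csv`.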
Examples
Raw CSV Export (No QC)
Study123_REDCapRaw_CSV_20231204_QCNo.csv
Labeled CSV QC’d by CNAG
Study123_REDCapLabel_CSV_20231204_QCYes_CNAG.csv
REDCap Dictionary QC’d by KI
Study123_REDCapDict_DICT_20231204_QCYes_KI.csv
Converted BFF JSON (QC by CNAG)
Study123_REDCapRaw_JSON_20231204_QCYes_CNAG_CONVBFF.json
Converted PXF JSON (QC by FPS)
Study123_REDCapLabel_JSON_20231204_QCYes_FPS_CONVPXF.json
No QC for a Converted BFF JSON
Study123_REDCapLabel_JSON_20231204_QCNo_CONVBFF.json
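Before attaching a pheno-clinical file, its name can be checked against the convention with a regular expression. This is a hedged sketch: the pattern is derived from the components and examples above, and the characters allowed in the study ID and QC center (letters, digits, hyphens) are an assumption rather than a documented rule.

```python
import re

# Pattern inferred from the naming components and the example names above.
FILENAME_PATTERN = re.compile(
    r"^(?P<study>[A-Za-z0-9-]+)"
    r"_(?P<source>REDCapRaw|REDCapLabel|REDCapDict)"
    r"_(?P<file_type>CSV|JSON|DICT)"
    r"_(?P<version>v\d+|\d{8})"
    r"_QC(?:No|Yes_(?P<qc_by>[A-Za-z0-9-]+))"
    r"(?:_CONV(?P<conversion>BFF|PXF))?"
    r"\.(?P<ext>csv|json)$"
)


def parse_filename(name):
    """Return the name's components as a dict, or None if it does not match."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for candidate in (
        "Study123_REDCapRaw_CSV_20231204_QCNo.csv",
        "Study123_REDCapRaw_JSON_20231204_QCYes_CNAG_CONVBFF.json",
        "Study123-clinical-final.csv",  # does not follow the convention
    ):
        print(candidate, "->", parse_filename(candidate))
```

Components that were omitted from the name, such as the QC center of a file without QC, come back as `None` in the returned dictionary.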
Once the project has been created, you can start creating datasets (the "folder" for your files).
- Navigate to the dataset creation page:
  - Click on "DATA SUBMISSION" in the navigation bar on the left
  - Click on "Create new Dataset"
  - Select the project you want to create the dataset in
- Create the dataset(s):
  - Fill out the table on the page manually or by uploading a filled out dataset_template.
  - Click on "Browse" in the "Clinical File" column to select the corresponding pheno-clinical information file to be uploaded
  - Optional: Click on "Browse" in the "Data Usage Policy File" column to upload a file containing specific data usage policy
  - Click on "VALIDATE" to check if all mandatory fields are filled out
  - Click on "SUBMIT" to create the dataset(s)
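Because fields cannot be corrected after submission, it may be worth checking a filled-out table locally before clicking "VALIDATE". The sketch below only illustrates the idea of a mandatory-field check; it is not how OmicsDM validates. The column names "Dataset ID", "Disease", and "Clinical File" are taken from labels mentioned on this page, but which columns are actually mandatory, and how the dataset_template names them, is an assumption.

```python
# Columns assumed to be mandatory; the real list is defined by OmicsDM.
MANDATORY_COLUMNS = ("Dataset ID", "Disease", "Clinical File")


def find_empty_mandatory(rows):
    """Return (row_number, column) pairs for blank mandatory cells.

    `rows` is a list of dicts, one per dataset, keyed by column name.
    Row numbers start at 1 to match how a table is read on screen.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        for column in MANDATORY_COLUMNS:
            value = row.get(column) or ""
            if not value.strip():
                problems.append((number, column))
    return problems


if __name__ == "__main__":
    table = [
        {"Dataset ID": "DS1", "Disease": "SLE",
         "Clinical File": "Study123_REDCapRaw_CSV_20231204_QCNo.csv"},
        {"Dataset ID": "DS2", "Disease": "", "Clinical File": None},
    ]
    for number, column in find_empty_mandatory(table):
        print(f"row {number}: '{column}' is empty")
```

The optional "Data Usage Policy File" column is deliberately left out of the check.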
FAQ
I accidentally filled in some fields incorrectly. Can I correct them?
No, you cannot correct the fields after the dataset has been created. Please contact the helpdesk to explain the situation so the admin can correct the fields for you.
last change 2024-12-08 by Ivo Leist
Upload Molecular Data File(s)
Tip: You can upload multiple files at once!
3TR specific
Here is the link to directly navigate to the file upload page:
3tr.gpap.cnag.eu/portal/#/submitfiles
- Navigate to the file upload page:
  - Click on "DATA SUBMISSION" in the navigation bar on the left
  - Click on "Submit files"
- Upload the file(s):
  - Select the project you want to upload the file(s) to
  - In the Dataset ID dropdown, select the dataset you want to upload the file(s) to, or upload a filled-out file_template.
  - Click on "Browse" to select the file(s) you want to upload
  - Click on "VALIDATE" to check if all mandatory fields are filled out
  - Click on "SUBMIT" to upload the file(s)
FAQ
Does the upload start as soon as I have selected a file?
No. Even though it looks like the upload has started, the files are not uploaded until you click on "SUBMIT".
Can I pause the upload process and resume later?
No, the upload process is not resumable. If, for example, you close the browser tab, the upload is interrupted and you have to start over.
After uploading I realised that there is a mistake in the file. Can I correct it?
No, you cannot overwrite the uploaded file. Instead, upload the file again with the correct data, making sure it has the same name as the original file. The system will automatically version the file. To avoid confusion, you can mark the original file as "deleted" so that users only have access to the correct file.
I accidentally uploaded the wrong file. Can I delete it?
No, you cannot delete any uploaded file (refer to File deletion). Best is to mark the file as "deleted" so no one can access it. If the file contains sensitive data, please explain the situation in an email to the helpdesk.
3TR specific
Please do not treat our data warehouse as storage for your raw data. Only upload processed data that is ready for analysis.