Quickstart: Using CyVerse for a Shared Project¶
Quickstart: Using CyVerse Bisque for a Shared Project¶
Prerequisites¶
Downloads, access, and services¶
In order to complete this tutorial you will need access to the following services/software
Prerequisite | Preparation/Notes | Link/Download |
---|---|---|
CyVerse account | You will need a CyVerse account to complete this exercise | CyVerse User Portal |
Bisque access | Request access to Bisque on the user portal | CyVerse User Portal |
Platform(s)¶
The following CyVerse platform(s) can be used in a collaborative project:
Platform | Interface | Link | Platform Documentation | Quick Start |
---|---|---|---|---|
Data Store | GUI/Command line | Data Store | Data Store Manual | Data Store Guide |
Discovery Environment | Web/Point-and-click | Discovery Environment | DE Manual | Discovery Environment Guide |
Bisque | Web/Point-and-click | BisQue | BisQue Manual | |
Input and example data¶
No example data are required for this quickstart.
Get started¶
- Any project members who will be using CyVerse should take a look at the Data Store Guide and the Discovery Environment Guide.
- Be sure that all project members register for CyVerse accounts at the CyVerse User Portal and request access to Bisque. This action will create a directory in their home folder called bisque_data.
Managing a shared Bisque project¶
Note
This quickstart is in progress. Please check back soon for the complete quickstart.
Managing a shared project with BisQue has some extra considerations, because image files in Bisque are organized into Datasets. Image files that are organized and shared using the Data Store (e.g., using the DE interface) can still be accessed via the image browser in Bisque, but they will not be organized as in the Data Store (add link to more in Bisque manual). If the Bisque web viewer or API is not needed, shared imaging projects can be managed using the Data Store just as for Managing a regular group project. Follow the guide below to decide how best to manage your project.
Choose the management workflow that best fits your project¶
CyVerse calls itself the “Lego building blocks of cyberinfrastructure”, which means there are dozens of ways to set up a shared imaging project. Below are some methods that we have found work well. If you have other suggestions, please let us know!
Manage image data through the Bisque UI¶
When is this a good option?
- Project members need to preview images
- Project members need to add graphical annotations
- Project uses Bisque apps for analyzing images
Manage image data through the Data Store in shared folder¶
When is this a good option?
- Project data are shared only with project members
Manage image data through the Data Store in Community Released folder¶
When is this a good option?
- Most of the project data are public
Sharing image data with project members¶
We strongly recommend that a single person be in charge of data management. There should also be a single person (generally the PI) who has ownership of the project folders and who sets read and write permissions for others. This ensures continuity when people move on. The PI can give ownership to a data manager for setting permissions, but should maintain their own ownership as well.
The owner of a folder has the ability to delete or rename the folder and any of its contents. If project members are given write permission to the project folder, they will be able to create their own sub-folders which they will own. In this way, project members can control access to their own data.
Tip
Before beginning your project, make a plan for how to name files and organize datasets. Remember that like traditional folders, datasets in Bisque can be nested, so you can have one overarching dataset, with sub-datasets. Agree on which metadata are needed for each type of file, and set up protocols for adding metadata when files are uploaded.
Sharing of images can be done through the BisQue interface or the Discovery Environment via the data sharing feature or on the command line using iCommands. Project members also can upload and download data using the desktop application Cyberduck, but Cyberduck cannot be used for setting sharing permissions.
According to the CyVerse Data Policy, all users receive a default allocation of 100GB. Shared data is counted as part of the allocation of whoever owns the folder that contains it. To request an increase to your allocation, should that become necessary, use the allocation increase form. We expect that users hosting shared directories will need to request larger data allocations.
If your project needs a shared folder for data that are going to be public during the active research phase of the project (e.g., you want to share transcriptomes or draft genomes as they are created, before publication), you can request a Community Released Data Folder. Community Released folders are intended for public data, not for shared projects that are kept private among collaborators.
Additional information, help¶
Fix or improve this documentation
- Search for an answer: CyVerse Learning Center
- Ask us for help: click on the lower right-hand side of the page
- Report an issue or submit a change: Github Repo Link
- Send feedback: Tutorials@CyVerse.org
Goal¶
Learn the basic steps for setting up a collaborative project using CyVerse.
Prerequisites¶
Downloads, access, and services¶
In order to complete this tutorial you will need access to the following services/software
Prerequisite | Preparation/Notes | Link/Download |
---|---|---|
CyVerse account | Be sure that all project members register for CyVerse accounts at the CyVerse User Portal. | Creating a CyVerse Account |
Platform(s)¶
The following CyVerse platform(s) can be used in a collaborative project. Be sure all project members are familiar with them.
Platform | Interface | Link | Platform Documentation | Quick Start |
---|---|---|---|---|
Data Store | GUI/Command line | Data Store | Data Store Manual | Data Store Guide |
Discovery Environment | Web/Point-and-click | Discovery Environment | DE Manual | Discovery Environment Guide |
Tip
If your project is image-based (i.e., you are sharing a lot of images), you may want to use Bisque to manage the data. If so, see the page on Managing a shared Bisque project.
Input and example data¶
No example data are required for this quickstart.
Get started¶
Make a plan. You should have a data management plan (DMP) before you begin your project. There are many resources on the web to guide you on how to create a DMP, including this CyVerse documentation.
The most important DMP elements for this quick start are to know how you will organize your files and folders, what metadata you will use, and to make sure you have a written SOP for managing data.
Tip
Before beginning your project, make a plan for how to name files and organize folders. Agree on which metadata are needed for each type of file, and set up protocols for adding metadata when files are uploaded.
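One way to make a metadata agreement checkable is to write it down as data. The sketch below is only an illustration: the file types, the attribute names, and the `missing_metadata` helper are hypothetical and would be replaced by whatever your project agrees on.

```python
from pathlib import PurePosixPath

# Hypothetical agreement: required metadata attributes per file extension.
REQUIRED_METADATA = {
    ".fastq": {"species", "tissue", "sequencing_platform"},
    ".tif": {"species", "magnification"},
}


def missing_metadata(filename, metadata):
    """Return the required attributes that are absent for this file type."""
    # Compare extensions case-insensitively so .TIF and .tif are treated alike.
    extension = PurePosixPath(filename).suffix.lower()
    required = REQUIRED_METADATA.get(extension, set())
    return sorted(required - set(metadata))


if __name__ == "__main__":
    # A .tif file that was uploaded with only one of its two required attributes.
    print(missing_metadata("leaf_01.tif", {"species": "Zea mays"}))
```

A check like this could be run before each upload, so that files with incomplete metadata are caught while the person who produced them still remembers the details.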
Share data with project members¶
Decide who has access to the data¶
There should be a single person (generally a PI) who has ownership of the project folders and who sets read and write permissions for others. This ensures continuity when people move on. The PI can delegate responsibility for setting permissions to a data manager, but should maintain their ownership of all folders as well.
The owner of a folder has the ability to delete or rename the folder and any of its contents. If project members are given write permission to the project folder, they will be able to create their own sub-folders which they will own. In this way, project members can control access to their own data.
Warning
Anyone who has own permission on a folder can delete it or rename it!
For projects that are part of a single lab, we recommend that the PI create a CyVerse account, then create a project directory and share it with lab members. Specific sub-directories can be shared with specific lab members as desired.
For projects that are collaborations among multiple labs, one person should create a project folder to share with all collaborators. Collaborators must decide among themselves who will host the main folder and who has read, write, and own permission for all folders.
Create folders and set permissions¶
Following the agreed-upon folder hierarchy, create the base folders for your project and give read, write, or own permission to the appropriate project members.
The sharing functionality of the CyVerse Data Store can be used to share data among project members. This can be done through the Discovery Environment via the data sharing feature or on the command line using iCommands. Project members also can upload and download data using the desktop application Cyberduck, but Cyberduck cannot be used for setting sharing permissions.
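For the command-line route, a folder and permission plan can be turned into a list of iCommands before anything is changed. The sketch below only builds and prints `imkdir` and `ichmod` command strings; it does not run them. The project path, usernames, and folder names are hypothetical, and the `/iplant/home/...` prefix is an assumption about where your home folder lives in the Data Store.

```python
import shlex

# Hypothetical project root in the Data Store; replace with your own path.
PROJECT_ROOT = "/iplant/home/pi_username/my_project"

# Hypothetical plan: sub-folder -> {username: permission}.
# Permission levels follow the text above: read, write, or own.
PLAN = {
    "raw_data": {"alice": "read", "bob": "write"},
    "analyses": {"alice": "write", "data_manager": "own"},
}

VALID_PERMISSIONS = {"read", "write", "own"}


def build_commands(root, plan):
    """Return the iCommands (as strings) that create and share each folder."""
    commands = []
    for folder in sorted(plan):
        path = f"{root}/{folder}"
        # imkdir -p creates the folder and any missing parent folders.
        commands.append(f"imkdir -p {shlex.quote(path)}")
        for user in sorted(plan[folder]):
            permission = plan[folder][user]
            if permission not in VALID_PERMISSIONS:
                raise ValueError(f"unknown permission {permission!r} for {user}")
            # ichmod -r applies the permission to the folder and its contents.
            commands.append(
                f"ichmod -r {permission} {shlex.quote(user)} {shlex.quote(path)}"
            )
    return commands


if __name__ == "__main__":
    for command in build_commands(PROJECT_ROOT, PLAN):
        print(command)
```

Printing the commands first lets the folder owner review the plan with collaborators, which matters because anyone given own permission can delete or rename the folder.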
Tip
According to the CyVerse Data Policy, all users receive a default allocation of 100GB. Shared data is counted as part of the allocation of whoever owns the folder that contains it. To request an increase to your allocation, should that become necessary, use the allocation increase form. We expect that users hosting shared directories will need to request larger data allocations.
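Because shared data count against the folder owner, it can help to total the expected sizes per owner before the project starts. The sketch below is a simple arithmetic illustration: the folder names and sizes are made up, and only the 100 GB default comes from the text above.

```python
DEFAULT_ALLOCATION_GB = 100  # default allocation stated in the text above

# Hypothetical top-level folders: (owner, folder name, expected size in GB).
FOLDERS = [
    ("pi_username", "my_project_raw_data", 70.0),
    ("pi_username", "my_project_analyses", 45.5),
    ("alice", "alice_side_project", 12.0),
]


def usage_by_owner(folders):
    """Sum folder sizes per owner, since usage counts against the owner."""
    totals = {}
    for owner, _name, size_gb in folders:
        totals[owner] = totals.get(owner, 0.0) + size_gb
    return totals


def owners_over_allocation(folders, allocation_gb=DEFAULT_ALLOCATION_GB):
    """Return {owner: GB over the allocation} for owners who exceed it."""
    return {
        owner: total - allocation_gb
        for owner, total in usage_by_owner(folders).items()
        if total > allocation_gb
    }


if __name__ == "__main__":
    print(owners_over_allocation(FOLDERS))
```

In this made-up plan the PI owns 115.5 GB of data and would need to request an increase, while the collaborator's own folder stays well within the default.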
If your project needs a shared folder for data that are going to be public during the active research phase of the project (e.g., you want to share transcriptomes or draft genomes as they are created, before publication), you can request a Community Released Data Folder. Community Released folders are intended for public data, not for shared projects that are kept private among collaborators. See more below under Publish data from a shared project.
Use metadata to organize your files¶
Folder hierarchies and file naming conventions can be an important part of managing a big data project, but you cannot rely on them to make your data FAIR (findable, accessible, interoperable, and reusable). Good metadata not only makes your data FAIR, but also makes it easier for you to manage your project by making files searchable and understandable.
CyVerse offers advanced metadata features for managing your data. See more details on the CyVerse wiki metadata page.
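If your project works on the command line, metadata can also be attached with the iCommands `imeta` command. The sketch below only builds `imeta add` command strings for review and does not run them. The file path and the attribute names and values are hypothetical examples, not a CyVerse-recommended schema.

```python
import shlex

# Hypothetical attribute-value pairs agreed on in the project's metadata plan.
SAMPLE_METADATA = {
    "species": "Zea mays",
    "tissue": "leaf",
    "collection_date": "2019-06-14",
}


def imeta_commands(path, metadata, is_folder=False):
    """Build one `imeta add` command per attribute for a file or folder."""
    # -d targets a data object (file); -C targets a collection (folder).
    target_flag = "-C" if is_folder else "-d"
    return [
        f"imeta add {target_flag} {shlex.quote(path)} "
        f"{shlex.quote(attribute)} {shlex.quote(value)}"
        for attribute, value in sorted(metadata.items())
    ]


if __name__ == "__main__":
    # Hypothetical file path in the Data Store.
    example_path = "/iplant/home/pi_username/my_project/raw_data/leaf_01.fastq"
    for command in imeta_commands(example_path, SAMPLE_METADATA):
        print(command)
```

Values that contain spaces are quoted by `shlex.quote`, so each printed line can be pasted into a shell as a single command.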
Share tools and analyses with project members¶
Projects can use CyVerse analysis platforms to develop and share analysis tools and workflows.
The Discovery Environment (DE) contains hundreds of applications that can be used by projects. Apps can be chained together to form workflows in the DE. It is now possible for CyVerse users to integrate their own applications or any open source application into the DE, using Docker containers. Projects may create private apps and workflows, to be shared only with project members, and then make those apps public when they are ready.
In the DE, you can create a team (add link to documentation) and share apps with your team.
Atmosphere can be used to set up a virtual machine (VM) with project software, which can then be used by all project members. The VM can later be imaged (made permanent) and published along with the project.
If your project includes a lot of computationally intensive analyses, you should consider requesting an XSEDE allocation (for the U.S. national supercomputer infrastructure) and setting up HPC workflows using tools such as Pegasus. Compute-intensive jobs that are highly parallel can be run on the Open Science Grid (OSG) via the DE.
Publish data from a shared project¶
When you are ready to publish the results of your project, you should also publish the data to an appropriate repository. For sequence data, that is one of the INSDC repositories, such as NCBI’s SRA. Other data types can be published to general scientific repositories or to the CyVerse Data Commons. See Publishing your data through the CyVerse Data Commons.
Group projects that are using a Community Released Data Folder to share data pre-publication are encouraged to transition to fully published data (with a DOI) when the data are stable. At that point, data can move into the Data Commons repository in its own folder, or it can remain within the shared project folder, but project members will lose edit access to the dataset. For questions about this option, contact doi@cyverse.org.