
User Functionality

The current UoM service offering utilises a subset of Mediaflux's capabilities (the functionality offered will grow over time). UoM currently provides a turn-key project capability:

  • Projects
    • A project namespace (think of it as a cloud-based file system) where you and your team can securely store and share your data collection
    • You can structure your project namespace however you wish
    • You can associate discoverable meta-data with the structure and with the files that you upload
  • Authentication
    • For University of Melbourne staff and students, you can log in directly with your institutional credentials
    • For other users, you can log in with a local account created for you
    • Login via the Australian Access Federation (AAF) is no longer supported, as the AAF no longer supports the required enhanced SAML plugin
  • Authorisation
    • Whichever account you log in with, it must be granted roles (a task performed by the Mediaflux support team) before it is authorised to access resources
    • Standard roles are created for each project (admin, read/create/destroy, read/create, read) and can be assigned to project team members
  • Access Protocols - Projects can be accessed via multiple protocols, including HTTPS and sFTP (see Encryption below)
  • Data Movement
  • Data Sharing
    • Data can be shared with external users (who don't have accounts) via shareable links (a hedged download sketch appears after this list)
    • Data can also be uploaded by external users (who don't have accounts) via shareable links
  • Encryption (discuss with RCS Data Team)
    • The HTTPS and sFTP protocols support encrypted transfers (a hedged sFTP upload sketch appears after this list)
    • Files can be encrypted at the storage layer (protection against unauthorised access to the system back end only). This is currently only supported with the HTTPS protocol; other protocols will be supported in the future
    • Selected meta-data can be encrypted (protection against unauthorised access to system back end only)
  • Data Redundancy
    • Mediaflux assets (containers holding meta-data and data content, e.g. an uploaded file) are versioned. Whenever an asset changes (e.g. its meta-data or content is modified) a new version is created. Old versions are retrievable.
    • A second Mediaflux server runs at the Noble Park data centre. This is known as the Disaster Recovery (DR) server.
      • The DR server is not accessible by normal users and is configured in a more restricted network environment.
      • The DR server is not used as a fail-over - that is, if the primary system fails, operations cannot be switched over to the DR server.
      • The redundancy process copies all asset versions from the primary server to the DR server. When a new asset version is created, that new version is sent to the DR server and attached to the appropriate asset.
      • Therefore, there are 2 copies of your data managed by Mediaflux (one on the primary system and one on the DR system).
      • Data that have been destroyed before they are backed up to the DR server cannot be recovered.
      • Data that have been destroyed on the primary server and that have been copied to the DR server are retrievable on request (an administration task).
      • There is no user-controlled process that can delete data on the DR server.
  • High Availability
    • The primary controller (the Mediaflux server that users log in to and interact with) is part of a High Availability pair. If one fails, the service can be moved to the other.
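
For illustration, the sketch below shows what an encrypted sFTP transfer into a project namespace could look like from a script (relevant to the Data Movement and Encryption items above). It is a minimal sketch only: the hostname, port, account details and project path are placeholder assumptions, not the real UoM endpoints; consult the RCS Data Team documentation for the actual connection details.

    # Minimal sketch of an encrypted sFTP upload into a project namespace.
    # The hostname, port, account details and project path below are
    # placeholder assumptions, not the real UoM Mediaflux endpoints.
    import paramiko

    HOST = "mediaflux.example.unimelb.edu.au"   # placeholder endpoint
    PORT = 22                                   # placeholder sFTP port
    USERNAME = "your-username"                  # institutional or local account
    PASSWORD = "your-password"

    def upload(local_path: str, remote_path: str) -> None:
        """Upload one file over sFTP; the transfer is encrypted in transit."""
        transport = paramiko.Transport((HOST, PORT))
        try:
            transport.connect(username=USERNAME, password=PASSWORD)
            sftp = paramiko.SFTPClient.from_transport(transport)
            sftp.put(local_path, remote_path)
            sftp.close()
        finally:
            transport.close()

    if __name__ == "__main__":
        # Copy a local file into a (placeholder) project namespace path.
        upload("results.csv", "/projects/my-project/raw-data/results.csv")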
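
Shareable links (see Data Sharing above) are plain HTTPS URLs, so an external collaborator without an account can fetch shared data with any HTTPS client. The following is a hedged sketch only: the URL is made up, and the real link is generated by Mediaflux when a project member creates the share.

    # Minimal sketch: downloading data that was shared via a shareable link.
    # The URL is a placeholder; Mediaflux generates the real link when a
    # project member creates the share.
    import requests

    SHARE_URL = "https://mediaflux.example.unimelb.edu.au/share/abc123"  # placeholder

    def download_share(url: str, destination: str) -> None:
        """Stream the shared content over HTTPS (encrypted in transit) to a local file."""
        with requests.get(url, stream=True, timeout=60) as response:
            response.raise_for_status()
            with open(destination, "wb") as out:
                for chunk in response.iter_content(chunk_size=1 << 20):
                    out.write(chunk)

    if __name__ == "__main__":
        download_share(SHARE_URL, "shared-dataset.zip")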

Other Relevant Operational Functionality

  • Database Backups
    • The database (the component that maintains all your meta-data and knowledge about assets, i.e. files) on the primary controller server is exported and saved every three hours.
    • Those DB exports are replicated (copied) to the second Mediaflux server at Noble Park, the DR (Disaster Recovery) server.
    • The exports are retained for 2 weeks; when they are removed from the primary after 2 weeks they are also removed from the DR server.
    • If the database becomes corrupted, the recovery gap is at most 3 hours: data that arrived in that window would still exist on storage, but the restored database would have no record of it (see the retention arithmetic after this list).
  • Scalability
    • The primary system consists of a controller node (handling database transactions) and 2 IO nodes. The IO nodes are used to actually move data to and from the storage. More IO nodes can be added as needed.
      • The IO nodes are currently only utilised for the HTTPS protocol (SMB support is coming)
    • The underlying storage is provided via a highly scalable CEPH cluster. More nodes can be added to the cluster as needed.
    • The combination of the scalable Mediaflux cluster and the scalable CEPH cluster provides a very extensible environment as our data movement needs grow.
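
As a quick sanity check on the figures above (a DB export every 3 hours, retained for 2 weeks), the small sketch below computes how many exports are held at any time and the worst-case window of un-recorded data. It is plain arithmetic only, not a description of the backup tooling itself.

    # Plain arithmetic based on the figures above: a DB export every 3 hours,
    # retained for 2 weeks, with a worst-case loss window of one export interval.
    EXPORT_INTERVAL_HOURS = 3
    RETENTION_DAYS = 14

    exports_retained = RETENTION_DAYS * 24 // EXPORT_INTERVAL_HOURS
    print(f"DB exports held at any time: {exports_retained}")          # 112
    print(f"Worst-case un-recorded window: {EXPORT_INTERVAL_HOURS} hours")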