Crane 1.0.0 released!


[This article was first published on Open Analytics, and kindly contributed to R-bloggers].

Publishing is an integral part of the data analysis process. Whether it’s in the form of
code, reports or technical documentation, at some point artifacts need to be shared. More often than
not, such artifacts are confidential and their access needs to be properly secured. There exist
solutions for at least some types of artifacts, but we needed one simple tool that
could help us with all the use cases we encounter in our daily practice, built with modern
technology and security standards.

Crane is a new open source product to host data science artifacts: data analysis reports,
documentation sites, or packages and libraries. It is an integral part of our open source suite to
build data science platforms and plays well with ShinyProxy and
RDepot.

Crane has been designed to comply with the strictest industry regulations in terms of security
and auditing and has been widely popular amongst our customers.

Why?

  • All of your data science artifacts are under strict authentication and authorization using
    modern protocols (OIDC)
  • Fine-grained access control is organized in an intuitive hierarchical tree
  • The artifacts can be pushed into Crane using an API (e.g. to automate report updates) or using
    a UI (for manual uploads)
  • Full audit logs are available to track operations on all files (e.g. for GxP purposes)
  • All configuration can be stored in Git and Crane fully supports infrastructure-as-code (IaC)

In the sections below, we dive deeper into the many features of Crane and give examples of how
Crane can be used.

High security and compliance requirements

Crane is designed with high security and compliance requirements in mind.

It provides declarative authorization rules ensuring that only authorized users can access the
data.

app:
  repositories:
    protected_repository:
      read-access:
        users: [ jack, jeff ]
      write-access:
        groups: [ writer, author ]
    authentication_required_repository:
      read-access:
        any-authenticated-user: true
    publicly_available_repository:
      read-access:
        public: true

Authorization rules for write and read access are defined separately, which provides flexibility
while remaining explicit. Crane also supports Portable Operating System Interface (POSIX) Access
Control Lists (ACLs) to control access to specific files or directories in cases that require
additional security.
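
To make the semantics of such rules concrete, here is a minimal Python sketch of how declarative
read and write rules could be evaluated. It is an illustration only, not Crane's implementation:
the `User` class and the `can_access` function are hypothetical names, the rule keys simply mirror
the configuration shown above, and the sketch assumes that write access does not imply read access.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical representation of an authenticated user."""
    name: str
    groups: set = field(default_factory=set)


# Rules mirroring the configuration above, written as plain dictionaries.
REPOSITORIES = {
    "protected_repository": {
        "read-access": {"users": ["jack", "jeff"]},
        "write-access": {"groups": ["writer", "author"]},
    },
    "authentication_required_repository": {
        "read-access": {"any-authenticated-user": True},
    },
    "publicly_available_repository": {
        "read-access": {"public": True},
    },
}


def can_access(repository, kind, user=None):
    """Return True if `user` (None means anonymous) matches the rule.

    `kind` is "read-access" or "write-access". A missing rule denies access.
    """
    rule = REPOSITORIES.get(repository, {}).get(kind)
    if rule is None:
        return False
    if rule.get("public"):
        return True
    if user is None:
        return False
    if rule.get("any-authenticated-user"):
        return True
    if user.name in rule.get("users", []):
        return True
    # Fall back to group membership: any shared group grants access.
    return bool(user.groups & set(rule.get("groups", [])))
```

In this sketch, `can_access("protected_repository", "read-access", User("jack"))` is allowed, while
the same user is denied write access because the write rule only lists groups.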

Support for multiple storage backends

Crane currently supports multiple storage backends, including S3 and the local file system. This
allows users to store and access data at scale in the cloud or on-premises. Each repository
can have its own storage location.

Hosting R and Python package repositories

Crane can be used to serve R and Python package repositories, either within a company network or
on a publicly accessible one. Because of the advanced access control, only users with the correct
permissions can access a repository. The native R and Python clients guarantee easy installation
of packages, so that security isn’t a burden for users.

Data science storage

As a data science storage solution, Crane stores all your data and only allows access by authorized
users. The data can be accessed using the web UI or HTTP API, allowing users to directly
browse and download the data.

In addition, the data can be accessed by data science applications, for example to store the
underlying data for a Shiny app. Usually, the Shiny app has direct access to all the data (for
that specific app). Sometimes, however, different users may only access certain datasets. In this
case, the Shiny app can use the identity of the user to download the data from Crane. This ensures
that authorization to the data is verified by Crane (instead of the Shiny app) and solves a
long-standing issue in the data science web app space.
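
The pattern could look roughly like the following Python sketch, in which the app forwards the
user's own access token to Crane instead of using credentials of its own. The base URL, the file
path, and the helper names are hypothetical, and sending the OIDC access token as a bearer token
is an assumption. Check the Crane and ShinyProxy documentation for how the token is made available
to the app.

```python
import urllib.request

# Hypothetical Crane base URL; replace with your own deployment.
CRANE_URL = "https://crane.example.com"


def build_request(path, access_token):
    """Build a GET request for `path` that carries the user's own token."""
    url = f"{CRANE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def fetch_for_user(path, access_token):
    """Download a file as the given user.

    The server decides whether access is allowed; a refusal surfaces as
    urllib.error.HTTPError, which the app can turn into a friendly message.
    """
    with urllib.request.urlopen(build_request(path, access_token)) as response:
        return response.read()
```

The point of the design is that the app never decides who may see which dataset: it only passes
the user's identity along and handles a refusal.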

Upload methods

Users can upload new data using the web UI or HTTP API.

Using the HTTP API, uploads can be automated, for example as part of automated report publishing
with your favorite CI/CD technology or pipeline.
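
A CI job could, for instance, run a small script like the Python sketch below. It is not taken
from the Crane documentation: the target URL, the use of POST with the raw file as the request
body, and the `CRANE_TOKEN` environment variable are all assumptions, so consult the Crane API
documentation for the actual upload endpoint and payload format.

```python
import os
import urllib.request


def build_upload(url, content, token):
    """Build (but do not send) an authenticated upload request.

    Assumption: the server accepts a POST with the file as the raw body.
    """
    return urllib.request.Request(
        url,
        data=content,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
        },
    )


def upload_report(local_path, url):
    """Read a rendered report and upload it; returns the HTTP status code."""
    token = os.environ["CRANE_TOKEN"]  # hypothetical secret injected by CI
    with open(local_path, "rb") as handle:
        request = build_upload(url, handle.read(), token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the token in an environment variable means it stays out of the repository and can be
rotated in the CI settings.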

Logs for auditing

All access to data is logged in the (optional) audit log of Crane. This feature supports running
Crane in qualified and validated environments for data storage and/or analysis.

Bring your own web UI

Crane ships with a minimal web UI that doesn’t use CSS libraries, so users can customize the look
and feel of the app. This way you can easily bring your own UI and use any CSS framework
you’re most comfortable with.

Tested

To ensure Crane can perform in high-security settings, the code base is covered by integration
tests that reach a code coverage of more than 70%.

Documentation and support

Full release notes can be found on the downloads page and updated documentation can be found on
https://craneserver.net. As always, community support on this new release is available at
https://support.openanalytics.eu.

Don’t hesitate to send in questions or suggestions and have fun with Crane and friends!




