Storage
Discussion of CKAN Storage support and its use in CKAN instances.
Extension
See Extensions#Storage_-_Integrated_File_Storage_for_CKAN (Storage: Integrated File Storage for CKAN).
Overview [Version 2]
Motivations
- Google Storage and Amazon S3 are explicitly designed for this situation: many users uploading files simultaneously into one large archive.
- We save bandwidth, and therefore save money.
- We don't expect our web server to cope well with multiple POST requests lasting for several minutes each.
- Webstorer can run on a separate machine and use our API, rather than having privileged access to local storage.
- We like service-oriented architectures and we like Steve Yegge's rant about Amazon's infrastructure.
- The add-resource process is tricky any way you spin it.
- UX demands that users fully upload a file before committing the resource.
- It's always a two-stage thing; we might as well have them upload to the final destination first.
- The Archiver remains simple.
- It is solely concerned with files linked-by-URL, and so doesn't need to be mentioned on this diagram :-)
Diagram
Overview [Version 1]
See also Data Storage Proposal.
Policy on Bucket Naming
Name buckets as:
- {CKAN site_id with '.' replaced by '-'}-storage
Why no dots? Google does not allow '.' in bucket names without a lot of hassle.
Examples:
- ckan.net: ckan-net-storage
- test.ckan.net: test-ckan-net-storage
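The naming rule above can be sketched in a few lines of Python. This is only an illustration of the policy: the function name `bucket_name` is hypothetical and is not part of CKAN or the storage extension.

```python
def bucket_name(site_id: str) -> str:
    """Build a storage bucket name from a CKAN site_id.

    Hypothetical helper illustrating the policy on this page:
    replace every '.' with '-' and append '-storage'.
    """
    return site_id.replace(".", "-") + "-storage"


if __name__ == "__main__":
    # The two examples listed on this page.
    for site_id in ("ckan.net", "test.ckan.net"):
        print(site_id, "->", bucket_name(site_id))
```

A site_id with no dots passes through unchanged apart from the '-storage' suffix.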